PREFACE



Participants in all parts of the study described in the five 
volumes of this report are listed below. 

The project staff and their areas of responsibility were:

Name                              Responsibility

Donald W. Fisher, Ph.D.           Project Director
Executive Director, AAPA/APAP

Mary Jane Crain                   Assessment of the applicability of the
Research Associate                University of Wisconsin's Individual
                                  Physician Profile (IPP) program for
                                  physician assistants

                                  Maintenance of a roster of CME programs
                                  for physician assistants

                                  Design of a system of CME program
                                  accreditation

Jane Faulman, Ph.D.               Verification of the role delineation for
Research Associate                the entry-level generalist position

                                  Physician assistant position
                                  classification

                                  Development of a self-assessment tool



The Project Officer for the study, from the Division of Associated
Health Professions, was Louis A. Quatrano, Ph.D.

Secretarial and administrative support was provided by: 

Veronica Marshall
Karen Hummer 
Linda Geary 

The project staff consulted a measurement and evaluation
specialist, Dr. Richard C. Cox of Pittsburgh, Pennsylvania, to assist
with the IPP and system of continuing medical education (CME)
program accreditation portions of this study. Specifically, Dr. Cox
designed a checklist for IPP participants and helped with the
analysis and presentation of all IPP checklist results. He also
assisted the staff in the development of checklists distributed to
physician assistants in attendance at selected CME programs throughout
the country. The data from these checklists were used in the
development of a system of accreditation of physician assistant
oriented CME programs.

The development of a self-assessment tool was placed under 
the direction of a consultant to the project who is a specialist in 
test development. This consultant was Ayres D'Costa, Ph.D., 
Associate Professor of Health Professions Education at The Ohio State 
University in Columbus, Ohio. Under Dr. D'Costa's guidance, a
self-assessment examination for physician assistants was developed.
Dr. D'Costa planned and conducted all meetings at which the test
specifications for the examination were delineated and test items
were prepared and revised. He was responsible for all computer output
necessary to the project. He designed an individualized,
computer-generated test report which includes respondents' scale
scores both numerically and graphically.

In addition to the help of consultants, the project staff also 
benefited from the special expertise and insight of members of the 
Evaluation, Working, and Advisory Committees. Each of the committees 
had a specific role to play in the completion of this study. 




The Evaluation Committee worked primarily on the assessment of the
applicability of IPP for physician assistants, the design of a system of
CME accreditation, and the maintenance of a roster of CME programs.
Members of this committee reviewed the data collected about the Individual
Physician Profile program, suggested other information to be obtained, and
made recommendations regarding the program's applicability for physician
assistants. The Evaluation Committee had a major role in the development
of instruments, the review of data, and the making of recommendations
regarding continuing medical education options and accreditation systems.
This committee also reviewed the roster format for CME programs.

Members of the Evaluation Committee included educational specialists
competent in criterion-referenced measurement, design of instructional
materials, evaluation methodology, and clinical simulation. The Academy's
Professional and Continuing Education Committee had two representatives
serving on the Evaluation Committee. The ten members of the Evaluation
Committee were:

Philip G. Bashook, Ed.D.
Chicago, Illinois

Robert J. Blakely
Chicago, Illinois

Sarah M. Dinham, Ph.D.
Tucson, Arizona

Stephen C. Gladhart, Ph.D.
Wichita, Kansas

Thomas R. Godkins, P.A.
Oklahoma City, Oklahoma

Jan L. Hagen, M.S.W.
Baltimore, Maryland

Paul F. Moson, P.A.-C.
Loretto, Pennsylvania

Robert R. Moutrie, Ph.D.
Newark, New Jersey

John E. Ott, M.D.
Washington, D.C.

Paul S. Toth, P.A.-C.
Durham, North Carolina

The seven-member Working Committee worked closely with the project
staff in developing the Role Delineation for the Physician Assistant. This
document was produced via the accomplishment of two tasks: verification of
an earlier role delineation (included in the Curriculum Resource Document
project) and determination of a position classification for the physician
assistant profession. Members of this committee included practicing
physician assistants, physicians (in private practice and hospital settings)
who employ physician assistants, faculty of physician assistant training
programs, and one representative from the Curriculum Resource Document
project. The members of the Working Committee were:

Mack Bonner, Jr., M.D. 
New York, New York 

Trudy Jo Companiotte, P.A.-C. 
Nashville, Tennessee 

William E. G. de Alva, M.D.
Denver, Colorado 

Carl E. Fasser, P.A.-C. 
Houston, Texas 

Stephen L. Joyner, P.A.-C. 
Ayden, North Carolina 

Allan B. Kunkel, M.D.
Cleveland, Ohio 

Daniel O. Myhre, P.A.-C.
Spokane, Washington 

Representatives from major medical organizations with a significant
interest in the physician assistant profession served on an Advisory
Committee to review materials and provide input to the staff and the other
two committees for all phases of the study. Representatives from the two
other committees for the contract served as liaison members on the
Advisory Committee. The members of this committee reviewed and provided
advice on data, interim reports, and conclusions and recommendations
about the role delineation for the physician assistant, the Individual
Physician Profile program, the system of CME program accreditation, and
the roster of CME programs for physician assistants. The members of
the Advisory Committee were:

Leo S. Bell, M.D., F.A.A.P., F.A.P.H.A.
San Mateo, California
American Academy of Pediatrics

Pearl H. Dunkley, R.N., Ed.D.
Kansas City, Missouri
American Nurses' Association

Carl E. Fasser, P.A.-C.
Houston, Texas
Representative of the Working Committee

Dan P. Fox, P.A.-C.
Oklahoma City, Oklahoma
American Academy of Physician Assistants

Thomas R. Godkins, P.A.
Oklahoma City, Oklahoma
Representative of the Evaluation Committee

Rolf M. Gunnar, M.D., F.A.C.P.
Maywood, Illinois
American Medical Association

J. Rhodes Haverty, M.D.
Atlanta, Georgia
National Commission on Certification of
Physician's Assistants

Frances L. Horvath, M.D.
St. Louis, Missouri
Association of Physician Assistant Programs

Joseph A. Intile, Jr., M.D., F.A.C.P.
Oregon City, Oregon
American Society of Internal Medicine

Robert Jewett, M.D.
Dayton, Ohio
Association of American Medical Colleges

Raymond H. Murray, M.D., F.A.C.P.
East Lansing, Michigan
American College of Physicians

Dan A. Nye, M.D.
Kearney, Nebraska
Federation of State Medical Boards
of the United States

Frederic 1. Schoen, M.D. 
Indianapolis, Indiana
American Academy of Family Physicians 

Daniel R. Thomas* 
Chicago, Illinois 
American Hospital Association

Harold Zintel, M.D., F.A.C.S.
Chicago, Illinois
American College of Surgeons

Three groups of physician assistant practitioners and educators
contributed to the development of the Self-Assessment Examination for
Physician Assistants. The Test Specifications Committee provided input
for the test specifications matrix, for item revision, and for future
research. The six members of this Committee were:



Carl E. Fasser, P.A.-C.
Houston, Texas

David L. Glazer, M.A.
Atlanta, Georgia

Allan B. Kunkel, M.D.
Cleveland, Ohio

Laurie Lipsig, P.A.-C.
Chicago, Illinois

Thomas E. Piemme, M.D.
Washington, D.C.

Judith B. Willis, M.A., P.A.-C.
Kalamazoo, Michigan

*In May 1979, Thomas Atchison, Ed.D., replaced Daniel Thomas at
the American Hospital Association.

A committee of 24 physician assistants met in two workshops to
develop and revise test items. The large majority of the items on the
exam were produced by this group. The Workshop Item Writers were:

Donald A. Abrams, P.A.
Jamaica Plain, Massachusetts

Randall C. Bennett, P.A.-C.
Gainesville, Florida

Scott Chavez, P.A.-C.
Las Vegas, Nevada

Robert Christie, P.A.-C.
Dayton, Ohio

Linda Davies, P.A.-C.
Arlington, Virginia

Dale B. Davis, P.A.-C.
Springfield, Missouri

Max Dawkins, P.A.-C.
Greensburg, Indiana

Robert Prance, P.A.-C.
Taylors, South Carolina

Edward Friedmann, P.A.-C.
Mason City, Iowa

George F. Hillegas, III, P.A.-C.
Baltimore, Maryland

Norman Holton, P.A.-C.
Royal Oak, Michigan

Charles E. Horan, P.A.-C.
Phoenix, Arizona

Paul Lombardo, P.A.-C.
Dix Hills, New York

John McCarty, P.A.
Marshfield, Wisconsin

Noel H. McFarlane, P.A.-C.
Silver Spring, Maryland

Dennis W. O'Dell, P.A.-C.
Wailuku, Maui, Hawaii

Leonard T. O'Neill, P.A.-C.
Omaha, Nebraska

Kenneth Ryther, P.A.-C.
Delta Junction, Alaska

Michael Sheldon, P.A.-C.
Portland, Maine

Valerie Staples, P.A.-C.
Durham, North Carolina

Valgene Valgoca, P.A.-C.
Omaha, Nebraska

Joseph Varano, P.A.-C.
Philadelphia, Pennsylvania

Cecil Walker, P.A.-C.
Carson, California

L. Timothy Whitmore, P.A.-C.
Richmond, Virginia

Thirty PAs were asked to be Field Item Writers. Materials on
writing test items were mailed to them, and they were requested to write
items and forward them to the national office. The PAs asked to be Field
Item Writers were:

Timothy Bauer, P.A.-C. 
Tomah, Wisconsin 

Walker-Boone, P.A.-C. 
Asheville, North Carolina 

Paul Cephus, P.A.-C. 
Houston, Texas 

Michelle Combs, P.A.-C. 
Lexington, Kentucky 

Wayne Cure, P.A.
Coxsackie, New York

Laura Davis, P.A.-C.
Advance, North Carolina

Marc Dicker, P.A.-C.
Wichita, Kansas

David Fraser, P.A.-C.
Denton, Texas

ABSTRACT



The purpose of this project was to develop a criterion-referenced
self-assessment examination for physician assistants (PAs), using the Role
Delineation as the basis, from which appropriate continuing education
could be developed. The test development effort was undertaken with the
help of Working Committees consisting of PAs and PA educators. A
six-hour examination consisting of 315 items has been constructed using
two try-outs.

The domain of the examination is the competency skills and
knowledge expected of an entry-level generalist PA. The domain has been
defined in terms of two sets of scales: 17 Role Scales and 28 Body
System Scales. The interpretation of scores is based upon minimum
competency scores decided upon by expert judgment using the Nedelsky
Technique.

Two innovative approaches were used in the implementation of this
project. One involved the use of critical incidents in the generation
of test items. The other involved the use of a three-factor conceptual
model for continuing medical education (CME) using self-assessment
examinations. It is the thesis of this model that CME must be based
upon a combined analysis of practice (P) requirements, individual felt
needs (N), and deficits identified by examination (E) scores. A
four-page computer-generated report was developed and returned,
along with an Interpretive Leaflet, as feedback to each PA who
participated in the Try-Out Exam.

I. INTRODUCTION



A. Background to Project 

The Self-Assessment project of the American Academy of Physician
Assistants (AAPA) lies at the very core of its mission "to facilitate
the recognition of the physician assistant as a professional dedicated
to the delivery of quality care" (AAPA, 1978). Quality looms as a
major concern of this new profession. The National Center for Health
Services Research (NCHSR, 1978) cited a 1976 estimate indicating that
1200 physician assistants and nurse practitioners had been trained as
a result of federal support since 1965. This 1978 NCHSR Report on
nurse practitioners and physician assistants focuses on medical care
utilization issues, particularly those emanating from current insurance
reimbursement restrictions. The NCHSR Report recommended an interim
100 percent reimbursement based upon the principle that (reimbursement)
rates should be related to the service performed and not who performs
the service.

Quality medical care is based on the competence of the provider,
but it also recognizes the principle that within a set of professional
roles, a physician assistant (PA) can be the health care provider of
choice over other health professionals. This principle may be described
as "role appropriateness" and is somewhat akin to "professional
specialization".

B. Purpose of Report 

This Report documents the development of a self-assessment system
by the AAPA for the continuing education of its members. The
self-assessment system was envisaged as an integral component of a major
contract supported by the Health Resources Administration (DHEW) by which
the competencies requisite for the entry-level generalist PA practitioner
were verified, the role of the PA delineated from that of other similar
health professionals, and a system developed for providing, evaluating,
and accrediting continuing education programs for PAs.

The self-assessment system was, by necessity, a pilot effort since
nothing like it existed for the PA profession. This is not surprising
given that the first graduates from PA programs have less than five years
in their practice. The self-assessment examination was explicitly
conceived in terms of a 300-item multiple-choice examination which would
be carefully constructed with the help of expert committees and
consultants, tried out, and tentatively utilized in a model continuing
education program designed to ensure professional competence among PAs.

Inasmuch as this Report highlights the processes and outcomes
entailed in the development of the self-assessment system, it is also
expected to serve as its Technical Manual. The self-assessment
project includes the following components: the development of
test specifications based on the Role Delineation; the development of
test items in conformance with these test specifications; the initial
try-out and revision of these test items; the pilot testing of the revised
test on a national sample of 100 PAs; the specification of minimum
competency standards; and the development of a computer-based scoring,
CME reporting and documenting system.

C. Rationale of Project

The self-assessment system is based upon certain axioms which are
presumed self-evident. They are derived from a multidisciplinary posture
formulated on the basis of experience with health professionals. These
axioms will now be listed and explained so as to provide a backdrop for
the project.

i) PAs are professionals and can be held responsible
for their own educational maintenance and growth.

A profession is based upon service and dedication to certain human needs.
Physician assistants are like other health professionals in this respect.
Professionals are expected to be responsible experts who are often called
upon to function at the frontiers of their disciplines by using judgment
and discretion in the performance of their duties. It is difficult to
assume responsibility for a professional because quality service must be
individualized both to the consumer's needs and to the provider's
capabilities. The quality of performance may be audited by peer judgments,
but such audits tend to focus on matters of gross negligence and
ineptitude. The purpose of continuing education should be not merely to
ensure minimally acceptable services, but rather to foster high quality
health care. Continuing education should therefore be based upon felt
needs, and responsible self-assessment seems to provide the best answer.
Mandatory programs are often doomed to become predictable failures.

ii) Given societal concerns for quality of health
care and the current expectation of professional
accountability, the AAPA is the most appropriate
organization of the profession to assume
responsibility for monitoring the quality of continuing
medical education programs available and the
number of credits earned by each member.

Just as individual professionals must ultimately be responsible
for their own learning, so must the profession monitor itself. However,
both are accountable to society and a system of verification is
therefore necessary. The AAPA has developed such a CME recording system
for PAs and it is planned to include self-assessment within this system.

The AAPA also enjoys distinct advantages because it speaks for
the profession. The resources it enjoys go beyond membership dues,
committee services, and technical input. The profession is youthful,
vigorous, and enthusiastic in striving for its image and future.

iii) PAs are busy professionals and therefore need
a continuing medical education (CME) system
that is easy to access, convenient to use,
self-paced, and non-threatening.

The self-assessment idea, using evaluative examinations as the basis
for CME learning prescriptions, appears sound and reasonable. This is
because such examinations can be packaged so as to be convenient and
inexpensive to utilize. Each PA administers such examinations to himself
at his own convenience. As experience is generated by the profession,
an integrated series of short examinations could be made available.
In turn the PA would select units according to his professional interests
and practice needs. The responses could then be scored by AAPA and
learning prescriptions returned to the PA with suggestions for a variety
of educational activities available to him. A PA could take parallel
examinations on a certain unit several times in a year until competency
is attained.

iv) The self-assessment program must be practice-based
and practice-oriented with emphasis on
critically needed skills rather than on
esoteric topics selected by teachers or
indicated by recent scientific breakthroughs.



Practitioners are interested in their day-to-day problems and
look for ways to deal with them effectively. New research findings, on
the other hand, clearly lack diffusion among practitioners and often
remain on library shelves unapplied. One reason for this is a lack of
orientation towards the practitioner in publicizing such research
findings. It is difficult for a busy practitioner to derive relevance from
published research. Continuing education must therefore emphasize the
translation of research findings into concrete ways by which research
can be applied in clinical practice.

Continuing education cannot be limited to new research findings
alone. Many professionals feel the need for broadening their skills
as they progress in their interdisciplinary practices. They wish to
understand how other experts think and function, if only to appreciate
referrals better. Some may even want to broaden the scope of their
services to patients. What is needed, therefore, may be regular clinical
skills. It is recognized that these lack the glamour of the new miracle
drug or medical procedure. But to limit continuing education to the
latter amounts to skirting the responsibility to enhance professional
competence and thereby to ensure quality health care to society.

The question that remains is whether self-assessment examinations
can be made relevant to clinical practice. The typical multiple-choice
test item has tended to be frustrating because of ambiguity of
the stimulus question or the trivial nature of its underlying
content/skill. This is unfortunately all too true. Good examinations are
difficult to write and require an arduous process of revisions to
develop. Recent successes in measurement with patient management
problems (PMP) hold a distinct promise. The PMP is distinctly different
from the typical multiple-choice item in that it simulates a
clinical scenario and requires the making of decisions very similar to
those actually made in practice.

The self-assessment examination of the AAPA uses clinical scenarios
as the basis for test items. Moreover, a unique philosophical
stance was taken by requiring that all items be generated in terms of
their relevance to critical skills linked to the PA role delineation.
A more detailed explanation of this unique procedure will be presented
later in this Report.

II. PROJECT PROCEDURES

A. Strategy

i) Utilize resources within profession as far
as possible.

The self-assessment examination was programmed to be developed
with the help of "working" committees rather than "policy-generating"
committees. A "checks and balance" system was obtained by identifying
two Committees: first, a Test Specifications Committee of 6 persons to
discuss test specifications and to develop sample clinical scenarios and
test items linked to the specifications matrix; and second, an Item
Writers Committee of 24 persons who worked on the actual development and
revision of the test items using the test specifications. Both
Committees consisted entirely of PAs and PA educators. Professional
measurement consultants facilitated the process of test development by
making necessary test item development, test-scoring and item analysis
resources available to the AAPA. Represented on the Test Specifications
Committee were the National Board of Medical Examiners and the National
Commission on the Certification of Physician Assistants, both of which
have worked closely with the AAPA and the Association of Physician
Assistant Programs (APAP) in the development of this program. Available
for try-out of the test were members of AAPA and APAP attending the
Seventh Annual Conference on Physician Assistants in Hollywood, Florida.

The utilization of PA resources in test development, aside from 
assuring dedication to the program by the profession, makes for needed 
leadership development in a young profession. Such experience is avail- 
able for future capitalization and constitutes a valuable investment in 
the profession. 
ii) Ensure long-term acceptance of program by
emphasizing service to PAs.

A self-assessment program is essentially a regular service that a
profession develops for its members. A successful program is based
upon membership confidence in the quality of the exams; its non-punitive
nature; its relevance to their professional needs; its availability,
turnaround time and cost; and, above all, the quality of the feedback
provided. A service-oriented program will attend to these qualities
because the target is more than the fulfillment of a governmental
contract. Nothing is more aggravating to members than a central
organization that seems to feed itself on short-term contracts at the
cost of its membership. Loyalty and solidarity of its membership are
important to the AAPA, and for this reason this self-assessment project
was structured so that service would be kept in mind at all times. Every
time a PA was to be asked to provide data by responding to a try-out
version of the examination, the Committees asked themselves: What
benefits can we provide the PA in return?

B. Work Plan 

1. Contractual framework: The revised contract (February 9, 1979)
provided that a criterion-referenced self-assessment examination be
developed by utilizing the following critical steps:

i) Obtain services of consultants with expertise in
development of criterion-referenced self-assessment
tools.

ii) Select competency areas from the major responsibility
domains of the entry-level generalist physician
assistant, using additional criteria in the process of
selecting topics for the self-assessment tool.

iii) Using a Working Committee, establish the test
descriptive scheme and generate items for the
self-assessment tool.

iv) State the test's descriptive scheme which
constitutes the self-assessment tool and identify the
specific items.

v) Develop (includes testing) the self-assessment
tool. Pilot test the exam on 100 PAs.

vi) Submit draft self-assessment tool and a
description of pilot testing results for review and
approval by Project Officer.

vii) Design a program by which the self-assessment
tool is made available to physician assistants.

viii) Prepare draft final report on the
self-assessment tool including recommendations for
its future use in profiling practitioners in
the field and development of learning packages.

Earlier, the AAPA had proposed a twelve-step scheme by which it
indicated that the test specifications would be developed by a Test
Specifications Committee of six experts in criterion-referenced testing,
including representatives from the National Commission on
Certification of Physician Assistants (NCCPA) and the National Board of
Medical Examiners (NBME). The item development would be undertaken by
content specialists at two workshops, the first of which would instruct
them in the development of such test items. It was also hoped that
content experts in the field would write items with the help of written
instructions and the test specifications. The Test Specifications
Committee would then meet to review the test items and assemble the
examination. The exam would then be pilot-tested, results provided to the
PAs, and the Test Specifications Committee convened a third time to make
recommendations for future use of the self-assessment examination.

2. The revised Work Plan: Early in March 1979, the AAPA hired
Ayres D'Costa, Ph.D., Associate Professor of Health Professions Education
at The Ohio State University, to serve as the Consultant for the Project.
After some initial discussions among the project staff, the
Consultant, and the Project Officer, a Schedule for Test Development was
agreed upon (see Table 1). This Schedule recognized the need for an
additional try-out of the test items being developed. This try-out was
scheduled for April 26, 1979, during the AAPA-APAP Convention in
Hollywood, Florida. Working around this fixed date, the first Item
Writers' Workshop was utilized to develop items and the second Item
Writers' Workshop was scheduled to revise the items on the basis of the
item analysis data and the comments received from PAs. All other aspects
of the contractual framework were left intact.

C. An overview of the Actual Work Schedule: 

Given a Test Specifications Committee (TSC) and an Item Writers
Committee (IWC), each of which would meet twice during the project
period, it was decided to bring the Test Specifications Committee
together early in April and once again towards the end of the project.
All Committee meetings were called Workshops and became intensive
work-sessions designed to produce specified project products.

The first TSC Workshop resulted in 1) prioritized lists of re- 
search and program objectives for the Self-Assessment Examination, 2) a 
test specifications matrix using the 11 role delineation areas along one 
axis and 5 skills levels along the other axis, 3) samples of test items 
generated from Critical Incidents/Scenarios, and 4) a Revised Schedule 
of activities planned for the project. 
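
For readers who want to see how a specifications matrix of this shape could be handled by computer, the following Python sketch builds an 11 by 5 grid of item counts and lists the cells that still lack items. Only the two dimensions (11 role delineation areas, 5 skill levels) come from the text; the function names and the sample item allocations are hypothetical and are not the Committee's actual categories or data.

```python
from collections import Counter

N_ROLE_AREAS = 11    # role delineation areas (from the text)
N_SKILL_LEVELS = 5   # skill levels (from the text)


def build_matrix(allocations):
    """Return an 11 x 5 grid of item counts.

    `allocations` is an iterable of (role_area, skill_level) pairs,
    both numbered from 1. The pairs used below are invented.
    """
    counts = Counter()
    for role, level in allocations:
        if not (1 <= role <= N_ROLE_AREAS and 1 <= level <= N_SKILL_LEVELS):
            raise ValueError(f"cell ({role}, {level}) is outside the matrix")
        counts[(role, level)] += 1
    return [[counts[(r, s)] for s in range(1, N_SKILL_LEVELS + 1)]
            for r in range(1, N_ROLE_AREAS + 1)]


def empty_cells(matrix):
    """List the (role_area, skill_level) cells that have no items yet."""
    return [(r + 1, s + 1)
            for r, row in enumerate(matrix)
            for s, n in enumerate(row) if n == 0]


# Hypothetical allocation of four items to three cells.
sample = [(1, 1), (1, 1), (3, 4), (11, 5)]
matrix = build_matrix(sample)
```

A listing of empty cells is one plausible way to direct item writers toward parts of the matrix that still need coverage.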

The next landmark event was the Item Writers Workshop on 
April 16-18, 1979. Prior to coming together, the members of this 
TABLE 1

REVISED SCHEDULE FOR TEST DEVELOPMENT

Date               Activity

March 24           Orientation materials sent to Test Specifications
                   Committee.

April 3-5          Test Specifications Committee #1 meets with project
                   staff and consultant. Test objectives and
                   specifications developed.

April 8            Orientation materials sent to item writers, both
                   workshop participants and PAs who will develop test
                   items in their practice settings. Both groups will
                   be asked to write 10-15 test items. The practice
                   group will mail these to the National Office prior
                   to May 1. The workshop group will bring these
                   questions with them.

April 16-18        Item Writers Workshop #1 with project staff and
                   consultants. Items will be written and reviewed.
                   Each participant is expected to develop about 10-15
                   items during the Workshop.

April 26           About 8 to 10 (non-parallel) test forms with about
                   30-40 items each will be tried out at the PA
                   Convention in Hollywood, Florida. Matrix Sampling
                   Approach will be used. The PAs will review each item
                   for readability, social desirability, and relevance
                   to practice.

April 27-May 2     Consultants will review item analyses data, as well
                   as PA comments, to perform some preliminary
                   revisions of items. The revisions and item analyses
                   data will be sent to each item writer in order to
                   request additional revisions based on medical
                   content. Some items will need to be dropped, new
                   ones developed; most will be revised.

May 17-19          Item Writers Workshop #2. Discussion of proposed
                   revisions for items; needs with respect to items;
                   overall test quality. Develop instructions for test
                   administration, scoring and interpretation strategy.

May 20-21          Consultants prepare final test form for mailing as
                   trial self-assessment instrument to 100 PAs.

May 25             Self-assessment instrument (Trial Form) mailed to
                   100 PAs.

June 4-13          PAs return completed self-assessment examination to
                   Consultant in self-addressed envelope.

June 14-16         Consultants score and item analyze self-assessment
                   exam.

June 17-18         Consultants review item analysis, make necessary
                   revisions and prepare report.

June 21-23         Test Specifications Committee Meeting #2. Exam and
                   report reviewed. Comments and further action
                   suggestions recorded.

June 29            Project Final Report due to HEW Project Officer.

Committee received instructions for writing scenarios and test items,
sample test items and scenarios, a project Schedule, and the PA Role
Delineation. They were asked to identify critical incidents related
to the Role Delineation and to bring these along to the Workshop.
The Workshop began with an overview of how test items are written to
test specifications and revised on the basis of item analysis. A
Guide for Item Writers was prepared with sections on Item Styles,
Item Editing Principles, Item Revision Principles Based on Item
Analysis Data, Writing Test Items on Interpersonal Skills, and
Some Basic Concepts on Bloom's Taxonomy. The Test Specifications
were discussed and the need to develop scales pointed out.

The 24 persons attending this IWC Workshop worked in four groups 
and produced four 80-item Test Sections. Each test item was referenced 
to the Role Delineation, to a Scenario, and ultimately to the Test 
Specifications Matrix. 

These four Test Sections were edited and "tried out" on PAs 
attending the AAPA-APAP Conference. This first Try-Out consisted 
of responding to the 80 items, then rating each item for relevance 
to PA practice, and finally indicating any problem words/phrases in 
the test items. The respondents remained anonymous and no feedback 
was promised other than the Answer Key. 

The responses received for this try-out were computer scored, item 
analyzed, and frequency distributions and other statistics generated. 
The relevance ratings were likewise scored and item analyzed. All this 
data was then summarized and mailed to the respective group of Item 
Writers responsible for the Test Section. Written and other comments 
on the test and individual items were also summarized. 
Members of the Item Writers Committee were urged to use the above 
material in preparing for the second Workshop on May 17-19. Specifically, 
five tasks were identified as homework: completing the Scenarios file, 
verifying the allocation of the items to the test specifications matrix, 
revising items using the item analysis and relevance summary data, develop- 
ing new test items where needed to fulfill the specifications matrix, and 
reviewing of test item options to generate appropriate feedback on error 
patterns of responders. 

At this point, it is necessary to mention that test items were also 
written by some field writers (PAs and PA educators selected by AAPA staff) 
using the Guide for Item Writers and other written materials available at 
this point in the project. Unlike the Item Writers Committee, the field 
writers worked on their own at home. A fifth Test Section of 80 items was 
thus developed. Section 5 was administered to a group of PAs and the 
responses were scored and item analyzed. 

Several PA programs responded to the AAPA call for Test Items. A 
large number of test items was thus accumulated. These items are of 
variable quality and have been neither critiqued nor coded to the 
Specifications Matrix. 

The pace of the second Workshop for Item Writers was hectic but 
a considerable amount of time was spent in reviewing the feedback capa- 
bilities of the Examination. The Item Writers recommended that it would 
be more meaningful to PAs if additional scales were developed using Body 
Systems and Medical Intervention Type as the basis. This resulted in a 
set of 28 scales. All available test items were classified in terms of 
these two new criteria (Body Systems and Medical Intervention Type) as well 
as the original test specifications criteria (Role Areas and Skill Levels). 
Additionally, three other criteria were utilized for analyzing the test 
items, namely, patient age, medical specialty, and common disease 
categories as identified by the Medical College of Virginia (Marsland et al., 
1976). This effort resulted in a test item bank of about 425 items with 
all items classified by these seven categories. The correct answer for each 
item was also documented in terms of standard medical texts. 

The second Item Writers Workshop resulted in 315 items. These 
were assembled into two Sections, with 160 and 155 items respectively, 
in order to fit a standard Digitek answer sheet. A third Section was 
added to obtain data on the practice profile of the PA taking the 
self-assessment examination, and also to ascertain felt continuing 
education needs in terms of the 28 System Scales. A few additional 
questions were added to get the professional background of the PA and to 
receive evaluative ratings on the project from the PA. 

Early in June, a self-assessment package, consisting of the three Sec- 
tions with appropriate answer sheets and directions for self-administration 
and use of return envelope, was mailed to a random sample of 300 PAs. This 
constituted the second try-out of the Exam. 

As scheduled, the second Test Specifications Committee Workshop was 
held on June 21-23. At this point, usable responses had been received 
from about 100 PAs, the number that had been originally planned for. 
Several tasks remained before these responses could be scored and reported 
on. These were: 1) the verification of the correct response to each test 
item by this other, independent Committee, 2) the verification of the 28 
System Scales and of the classification of the test items in terms of 
these scales, 3) the development of appropriate Role Scales and the 
verification of the classification of the test items in terms of these scales, 
4) an independent review of each test item in terms of its quality, 
5) the determination of the scores expected of a minimally competent PA 
(Nedelsky Method), 6) a review of the four-page Report to be 
computer-generated and provided to each PA taking the self-assessment 
examination, and 7) a list of recommendations for additional work and 
next steps with this project. 
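
The Nedelsky computation named in task 5 can be illustrated with a short sketch. In the standard form of the method, judges mark the wrong options that a minimally competent examinee should be able to rule out, and each item contributes the reciprocal of the number of options left (the correct answer included). The function names and the six-item scale below are hypothetical and are not taken from the project's data or computer system.

```python
from fractions import Fraction


def nedelsky_item_value(n_options, n_eliminated):
    """Chance that a minimally competent examinee answers one item correctly.

    n_options    -- total number of response options on the item
    n_eliminated -- wrong options the judges expect such an examinee to rule out
    """
    if not 0 <= n_eliminated <= n_options - 1:
        raise ValueError("eliminated options must leave at least the correct answer")
    remaining = n_options - n_eliminated
    return Fraction(1, remaining)


def nedelsky_expected_score(items):
    """Sum the item values to get the expected score on a scale."""
    return sum((nedelsky_item_value(n, e) for n, e in items), Fraction(0))


# Hypothetical judgments for a six-item scale: (options, options eliminated).
example_scale = [(5, 3), (5, 2), (4, 2), (5, 4), (4, 0), (5, 1)]
expected = nedelsky_expected_score(example_scale)
print(f"Expected score of a minimally competent PA: {float(expected):.2f} of {len(example_scale)}")
```

Exact fractions are used so that the scale total is not affected by rounding of the individual item values.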

The Test Specifications Matrix was redefined in terms of 17 
Role Scales based upon a regrouping of the 11 areas in the Role 
Delineation, a revision of the five skill levels, and the introduction 
of Body Systems categories into the Role Delineation. Revisions were 
recommended to about 60 of the items, and several were tagged for 
deletion from the examination. Unfortunately, the Committee did not 
have access to the Item Analysis on the Revised Examination at the time 
of its meeting (the responses had barely been received then) and so the 
recommendations were entirely judgmental. 

Individual comments and extensive reviews of test items have since 
also been received from PAs in the field. The examination will therefore 
need to be thoroughly revised on the basis of all these comments and 
reviews, as well as on the basis of the item analysis data now available. 
An initial review of these data by the Consultant was used to arrive at 
tentative decisions on the scoring key for this try-out reporting. 

The four-page Report has been computerized and an Interpretive Leaflet 
prepared to accompany this Report to each of the 108 PAs who participated 
in this Second Try-Out of the examination. A special computer system has 
been developed to generate these Reports and to provide the AAPA with: 
1) the usual measurement quality indices for this examination such as 
reliability, coefficient of agreement and standard error of measurement 
for each scale, 2) a summary of the scale scores for the total group in 
terms of means, standard deviation, range and frequency distribution, 
3) a summary of the practice profile scores and continuing education 
needs scores for the total group, and 4) a summary of the evaluative 
feedback provided by the 108 PAs on the self-assessment project. 
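
As an illustration of the indices listed above, the sketch below computes the group summary for one scale and the standard error of measurement from the usual relation SEM = SD x sqrt(1 - reliability). It is not the project's computer system; the function names, the scores, and the reliability value are hypothetical, and the population standard deviation is an assumed convention.

```python
import math
import statistics
from collections import Counter


def scale_summary(scores):
    """Group summary for one scale: mean, SD, range and frequency distribution."""
    return {
        "mean": statistics.mean(scores),
        "sd": statistics.pstdev(scores),  # population SD (an assumed convention)
        "range": (min(scores), max(scores)),
        "frequencies": dict(sorted(Counter(scores).items())),
    }


def standard_error_of_measurement(scores, reliability):
    """SEM = SD * sqrt(1 - reliability), with reliability between 0 and 1."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return statistics.pstdev(scores) * math.sqrt(1.0 - reliability)


# Hypothetical scale scores for eight PAs and a hypothetical reliability.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
summary = scale_summary(scores)
sem = standard_error_of_measurement(scores, reliability=0.75)
print(summary)
print(f"SEM = {sem:.2f}")
```

A short scale with modest reliability gives a large SEM relative to its score range, which is one reason scale length matters for individual feedback.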



D. List of Products Developed/Under Development 



Several products have been generated by the project. Those enclosed 
with this Final Report are indicated by an asterisk. Intermediate 
products and by-products are listed but not enclosed. Products that are 
under development are indicated in italics. 

  i      Unprioritized List of Program and Research Objectives 
         for the Self-Assessment Examination (Exhibit A) 

  ii     The Test Specifications Matrix and its proposed 
         Implementation Chart (Figure 2) 

  iii    Instructions for writing a Scenario and a sample 
         test item generated from a Scenario (Tables 7, 8, 
         and 9) 

  iv     Guide for Item-Writers 

  v      Five 80-item Test Sections (First Try-Out) 

  vi     Item Analysis Results for the five Test Sections with 
         usual scores statistics 

  vii    Frequency Distributions of the Relevance Ratings 
         for the four Test Sections 

  viii   Summary of Written Comments on items in the four Test 
         Sections 

  ix     A Scenarios File listing critical incidents for the 
         Role Delineation 

  x      Test Items (Unclassified) 

  xi     Test Items File for Classified Items 

  xii    The 28 Body System X Medical Intervention Scales 
         (Figure 4) 

  xiii   The 17 Role Area X Body System Scales (Figure 3) 
  xiv    Correct Answer Documentation File (to be merged 
         into Test Items Bank) 

  xv     The three Sections of the Self-Assessment 
         Package (Second Try-Out) 

* xvi    Scores expected of Minimally Competent PA (Nedelsky 
         Method) by Scale (Exhibits G, H) 

* xvii   The Individualized Report (Exhibit J) 

* xviii  The Interpretive Leaflet (accompanies Individualized 
         Report) (Exhibit K) 

* xix    Statistical Summary Reports on Scores (Exhibits C, D, & E) 

* xx     Summary of Evaluative Ratings of Project (Table 10) 

  xxi    Computer Scoring and Reporting System 

* xxii   List of Recommendations for Future Efforts 
         (Chapter IV) 

  xxiii  Item Analysis Results for the Second Try-Out with 
         usual scores statistics 

  xxiv   Research studies/papers 

  xxv    Symposium for presentation at the 1980 Annual Convention 
         of the American Educational Research Association 

  xxvi   The training of a small group of PAs in the technical 
         aspects of item writing, item revision, and test 
         development procedures 

  xxvii  Listing of computer cards documenting characteristics 
         of all items in Test Item Bank 



E. Project Problems Encountered 

1) The time crunch. For several reasons, the project did not get 
actively underway until early March 1979. Therefore, the process for 
developing the products of this project had to be compressed. The AAPA 
was fortunate to receive a 45-day extension from HRA so that the essential 
products could be completed as proposed. 
2) Quality of Test Items. It was difficult to produce test items 
in larger quantity and of better quality, despite the excellent efforts on the 
part of all concerned, because time is needed to train more physician 
assistants in the technical aspects of item writing and item revision. A 
few physician assistants are currently available with such expertise 
but their number is not large enough because the profession is young. The 
item styles utilized in the examination, the quality of the response 
options, and the cognitive level of the questions can be improved as more 
time becomes available and experience is gained. 

3) Technical Problems. Although these will be discussed in greater 
detail in another section of this Report, the project had to contend with 
the current deficiencies in the technical state-of-the-art relative to 
self-assessment methodology, the measurement of professional competency, 
the setting of minimum competency standards, and the development of 
criterion-referenced examinations. Traditional testing, as contrasted 
with self-assessment, uses rigorous test administration procedures. 
Little seems to be known about self-assessment, and even less about why 
and how professionals seek continuing education. Professional competence 
remains a complex set of skills, the most critical of which, such as 
interpersonal and attitudinal skills, are still very difficult to measure 
by multiple-choice examinations. The techniques for setting minimum 
standards are typically judgmental and are therefore prone to error and 
bias. The situation with criterion-referenced testing is like that of 
the tail wagging the dog. The public is sold on the idea, but the 
technical cupboard is yet bare. The techniques available for the development 
of such examinations are yet on the frontiers of measurement technology 
and therefore not easily available. 

III. TECHNICAL ISSUES 


A. Objectives of Self-Assessment Examination 

A self-assessment, unlike a self-rating, does not necessitate a 
self-indictment. Ratings seem to have an end-point finality about them 
that influences the manner in which individuals are willing to look at 
themselves. Perry (1977) noted that although physicians are very happy 
with physician assistants, the validity of self-ratings of performance 
by physician assistants was generally questionable. Furthermore, 
Kegel-Flom's research quoted by Perry indicated that personality 
characteristics substantially bias self-evaluation of performance. 

A self-assessment is an opportunity for self-improvement without 
any judgmental labels or punitive consequences. It is likely that the 
professional's interest in seeking continuing education is influenced 
by his feelings of inadequacy or his need for better knowledge and 
skills. Self-assessment could be an aid to kindle such feelings or 
needs. The fact that travelers will test their IQ during their leisure 
times may indicate an innate human curiosity about oneself and augurs 
well for the practical utility of making self-assessment tests available 
on a voluntary basis. 

Research on self-evaluations of dentists (Milgrom et al., 1978) 
indicated that their accuracy increased as they became more specific. 
In other words, professionals are more threatened by global assessments 
and more willing to acknowledge deficiencies in certain specific aspects. 

It is also essential to emphasize the diagnostic intent of 
self-assessments in order to differentiate them from certifying exams (Engel, 
1976). These differences significantly impact upon the manner of 
constructing and interpreting such tests. 

The ultimate purpose of the self-assessment examination is to 
enable physician assistants to maintain their competence and thereby 
ensure the quality of health care. A competent professional may be 
defined as one who knows how to do well the job expected of him and is 
able to translate this knowledge into his practice. Competence includes 
knowledge, application, and attitudinal skills. At a higher level, 
application develops into technical problem-solving as well as into 
interpersonal communication skills. As a result, one might define, 
using a combination of Bloom's cognitive system, Krathwohl's affective 
system, and Gagne's learning model, a five-level system of competence 
as follows: knowledge, application, technical problem-solving, 
interpersonal communications, and professional attitudes. 

The key to the maintenance of professional competence does not 
lie merely in the providing of appropriate continuing education programs. 
Professionals tend to be busy persons who are not easily convinced of 
the practical utility of attending educational programs. Often such 
programs are not targeted to their immediate needs, or they are 
inconvenient to attend, or they are inappropriately handled by instructors, 
or they remain unknown to the busy professional. 

Recent efforts by the professions to require recertification on 
a regular basis by their membership are based upon the rise of 
malpractice suits and a continuing demand by an increasingly 
better-educated society for good quality care. Professional accountability, 
however, is limited to what one claims to be able to do and is actually 
engaged in doing. The maintenance of competence is therefore 
circumscribed by what a professional professes to be able to do by virtue of 
his role and by what he actually deals with in practice. One way of 
looking at professional accountability is via a consumer-provider model 
of professional roles (D'Costa, 1975) shown in Figure 1. There are 
four forces at play: the role expectations of the profession, the role 
expectations of the patient and of consumers, and those of the individual 
professional himself. 

The Self-Assessment Examination of the AAPA was accordingly 
designed to encourage a PA to plan his continuing medical education (CME) 
in terms of these various forces, namely: professional role, practice 
expectations, and individual felt needs. The AAPA sees as its role the 
development of appropriate self-assessment tools, the facilitation of 
such planning, and the providing of worthwhile continuing education 
programs on an efficient basis. 

In assessing this AAPA role at the first Test Specifications 
Committee Workshop, a distinction was made between day-to-day programmatic 
goals and technical/research goals of a self-assessment examination 
(Exhibit A). The five most important program-related goals were 
identified as: 

1) Develop a national profile of PA-CME needs; 

2) Ensure that the self-assessment was not narrowly 
conceived as an aid to recertification but rather 
as an aid to the maintenance of professional 
competence and quality of practice; 

3) Recognize that, since the primary purpose is to 
help the PA plan his continuing medical education, 
the self-assessment program should stress 
sufficient feedback to the PA; 

4) Recognize that the present paper and pencil 
examination may not encompass all aspects of clinical 
competence; and 

5) Recognize that this self-assessment is geared to 
individual needs and therefore may not be directly 
useful to evaluate PA training programs. 
FIGURE 1 

The Four Types of Expectations of Health Professionals 

Codes 

P = Providers of health care/profession 

C = Consumer of health care/society 

pt = Patient 

--> = Process 

| | = Structure 

At the technical/research level, it was recognized that the 
state-of-the-art is far from adequate. Accordingly the following were 
identified as the five most important technical goals: 

1) Define the core/critical skills, behaviors, and 
knowledge expected of entry-level generalist PA 
professionals; 

2) Study the relationship between competence and 
tasks frequently done; 

3) Study the relationships between self-expressed 
competence and test-derived competence; 

4) Identify strengths and weaknesses of PAs in 
terms of training program, geographic location, 
and practice specialty; and 

5) Identify causal dimensions of professional 
performance. 

On second thought, the Committee decided that while these goals 
were good to maintain for perspective purposes, the major efforts of 
this project should focus on the development of the examination and on 
the setting up of a self-assessment model with emphasis on feedback. 

B. Technical Rationale of Test 

The original HRA contract called for several self-assessment 
tools each with a correlated individual independent study package. As 
discussions between the AAPA and HRA continued it became evident that 
it would be too early to embark upon such a massive program. Accordingly, 
the contract was modified to specify the development of a single 
criterion-referenced self-assessment tool. Not only was this goal 
reasonable in the circumstances, but it also provided opportunity for 
the development of the necessary technical framework upon which a system 
of self-assessment tools could be developed in the future. Merely 
generating several self-assessment tools might have been disastrous. 
The main reason for the above line of thinking lies in a basic 
principle of criterion-referenced testing. Popham (1978) states that 
such tests are designed to ascertain an individual's status with respect 
to a well-defined behavior domain. The precise definition of the domain 
in terms of skills is critical to the concept of criterion-referenced 
testing because of the need to make generalizations about the mastery 
or non-mastery of these skills based on test scores. 

C. Test Specifications 

Hambleton and Eignor (1979) provide a 12-step process for developing 
and validating criterion-referenced tests. Unfortunately, their 
emphasis is on objectives and the specification of item formats and number 
rather than on the crucial matter of domain definition advocated by 
Millman (1974). Merely listing objectives related to criteria becomes 
an atomistic approach that is limited in meaningfulness and relevance 
when it comes to interpretation or self-assessment. To this project the 
quality of self-assessment is paramount and for this reason the matter 
of domain definition becomes very important (Pottinger, 1977). The 
criteria or objectives must be linked to the main domain and the link- 
ages must be clear. Only then will a PA recognize the implications of 
his weakness in some skills in relation to his overall performance as a 
PA. 

The domain of this self-assessment examination is the performance 
expected of a minimally competent generalist PA. Wilson (1976) argues 
that competency assurance in a credentialing program "must be based on 
a sound generic position classification". Fortunately, such an analysis 
has been completed and verified in the case of the PA profession by the 
AAPA. Indeed this project is an integral component of that major effort. 
The role delineation of a profession describes the tasks which a 
practitioner must be competent to perform. This contrasts with other 
approaches, such as task inventories, which list all tasks that a prac- 
titioner can, should, or might perform. The role delineation thus pro- 
vides a position classification and is a minimum standard expected of 
all practitioners in the profession. A role delineation is expressed in 
terms of performance responsibilities rather than just knowledge expected. 

The 1979 version of the Role Delineation for the PA (see Volume II) 
lists 11 major areas of responsibility (Exhibit B). Each area is 
extensively defined in terms of specific responsibilities. Together, the major 
and specific responsibilities define the domain of this self-assessment 
exam. 

The structure of the domain was initially recognized as the 11 
areas of major responsibility. The specific responsibilities under each 
area were also recognized for purposes of definition and item generation, 
thereby ensuring fidelity to the meaning assigned to each role area in 
the Role Delineation. However, they were considered too numerous 
to include in a test specifications matrix. The 11 areas of 
responsibility served as the major content areas defined along one dimension 
of a specifications matrix. It is customary to identify skills levels 
as the other dimension. Typically, Bloom's taxonomy (1956) of cognitive 
skills has served as this dimension. In the present situation a 
five-level scheme was generated as follows: knowledge, application, 
problem-solving, interpersonal skills, and professional attitudes. 

An 11 X 5 matrix thus served as the initial test specifications 
matrix. It was recognized that the 55 cells in this matrix were too 
many to utilize as scales for feedback purposes. Concern was expressed 
by the Test Specifications Committee about the reliability of test items 
related to role areas such as: recognize interdependent relationship, 
demonstrate professional behavior, promote acceptance of PA role, 
and maintain competency. It was also noted that measurement techniques 
available for attitudinal and interpersonal skills are not of the usual 
paper-and-pencil type. 

It was the intent of this project to develop a test specifications 
matrix that represented the ideal expectations of a self-assessment 
examination and to use this as the target during the item development process. 
However, this was found difficult to implement in practice 
and a compromise procedure was utilized. The Committee 
members began by assigning ideal weights for the specifications matrix 
on an individual basis, but later during the group discussion process 
they negotiated compromise weights with each other using current 
measurement realities as their basis. Table 2 presents the weights 
(shown within boxes) arrived at by the Committee for the row and column 
totals or matrix marginals. Low weights were assigned to Areas 1, 2, 10, 
and 11 and to Skill 5 even though these weights did not reflect their 
importance to competent performance. The weights assume that the total 
number of test items would be 300. 

The derivation of individual cell weights was initially done 
mathematically, using an expected frequency computational approach. 
Table 2 reflects such expected values. However, an actual specifications 
matrix does not need to have each cell weight proportionate to its 
respective marginals (row and column totals). Instead some cells can be left 
blank and others enhanced (in order to reflect the real world) without 
violating the marginals. Such a refinement of the Test Specifications 
Matrix is presented in Figure 2. 
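
The expected frequency computation can be sketched as follows, on the assumption that it takes the familiar form in which each cell equals its row total times its column total divided by the grand total. The marginals below are hypothetical stand-ins chosen only to sum to 300 items and to give low weights to Areas 1, 2, 10, and 11 and to Skill 5; the Committee's actual marginals are those of Table 2.

```python
def expected_cell_weights(row_totals, col_totals):
    """Expected number of items per cell: row total * column total / grand total."""
    grand_total = sum(row_totals)
    if grand_total != sum(col_totals):
        raise ValueError("row and column marginals must have the same grand total")
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical marginals for 11 role areas and 5 skill levels (300 items in all).
row_totals = [6, 6, 42, 42, 42, 42, 36, 36, 36, 6, 6]
col_totals = [90, 90, 78, 30, 12]

cells = expected_cell_weights(row_totals, col_totals)
for area, row in enumerate(cells, start=1):
    print(f"Area {area:2d}: " + "  ".join(f"{value:5.2f}" for value in row))
```

Cells computed this way are rarely whole numbers and many are very small, which is consistent with the text's point that an actual matrix may leave some cells blank and enhance others so long as the marginals are respected.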
TABLE 2 

TEST SPECIFICATIONS MATRIX CELLS AND MARGINALS 

FIGURE 2 

IMPLEMENTING THE TEST SPECIFICATIONS MATRIX 
USING TS COMMITTEE MARGINALS 

The advantage of this scheme in implementing test specifications 
lies in its feedback capabilities. Instead of 55 cells, some of which would 
have very few test items allocated, it now becomes possible to cluster 
test items around relevant cells/scales. Note, too, that the number of 
scales can be reduced to a manageable number. 

The boxes in Figure 2 represent six items each. This was done 
because research by Eignor and Hambleton (1979) on effects of test length 
on selected test score reliability and validity indices indicated that, 
depending upon the domain characteristics and the decision-making strategy 
used, even tests with as few as 6 items could be effective in 
criterion-referenced testing. It is assumed that two or more boxes can be 
combined to form a single scale wherever appropriate. However, the potential 
for creating more than one scale within each cell permits other criteria 
to be recognized, thereby acknowledging the multidimensionality of the 
test domain. Note that this implementation of the Specification Matrix 
does not change the originally prescribed matrix marginals. The items 
represented by the total number of boxes add up to the row and column 
marginals/totals. 
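
The claim that the boxes add up to the prescribed marginals can be checked mechanically. The sketch below assumes six items per box, as stated above, and uses a small hypothetical 3 X 3 allocation rather than the actual Figure 2 layout; the function names are illustrative only.

```python
ITEMS_PER_BOX = 6  # each box in Figure 2 represents six items


def marginals_from_boxes(box_counts):
    """Row and column item totals implied by a grid of box counts."""
    row_totals = [ITEMS_PER_BOX * sum(row) for row in box_counts]
    col_totals = [ITEMS_PER_BOX * sum(col) for col in zip(*box_counts)]
    return row_totals, col_totals


def preserves_marginals(box_counts, row_marginals, col_marginals):
    """True if the box allocation reproduces the prescribed marginals exactly."""
    rows, cols = marginals_from_boxes(box_counts)
    return rows == list(row_marginals) and cols == list(col_marginals)


# Hypothetical allocation: some cells left blank (0), others enhanced.
boxes = [
    [2, 1, 0],
    [1, 2, 1],
    [0, 0, 1],
]
print(marginals_from_boxes(boxes))
print(preserves_marginals(boxes, [18, 24, 6], [18, 18, 12]))
```

Because every box holds six items, an allocation of this kind can match the marginals exactly only when each marginal is a multiple of six.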

The allocation of the boxes (potential scales) to the cells in 
the matrix was done so as to make optimum practical sense given the nature 
of the role responsibilities, the level of skills required, and the 
recommendations of the Test Specifications Committee. 

D. Scales Definition 

The theoretical or a priori derivation of scales assumes that 
enough is known about the real world of PA competence by the 
Specifications Committee. This was not the case in this project primarily because 
the profession is young and little data is currently available. For 
these reasons a "successive approximations" strategy was employed. This 
issue of scales definition was taken up at every Workshop of the two 
Committees and it was not until the final Workshop that the scales were 
finalized for this project's purposes. In keeping with this strategy 
it can be expected that the future will see additional modifications to 
the two sets of scales currently defined by the project for the 
examination. 

The two sets of scales are named "Role" and "Body System" (see 
Figures 3 and 4). In actuality each set of scales is defined by a matrix 
with two criteria. The 17 Role scales are defined by the cells of a 
matrix obtained from the 11 role areas and 13 body systems. The 28 Body 
System scales are defined from the matrix defined by 13 body systems and 
four medical intervention types. Table 3 provides broad descriptions of 
the Role Area Scales. 

Several questions arise at this point. How was "Body Systems" 
selected as a criterion? Why "medical intervention type" and why not 
some other criterion such as "patient type"? Why is "Body System" 
utilized a second time with the 11 "Role areas"? What happened to the 
five "Skills levels"? What types of criteria were considered before these 
decisions were made? These considerations are critical to an 
understanding of the test domain defined and to an appreciation of the 
problems inherent in developing useful test specifications for a 
self-assessment examination. 

To begin with, it was understood that the starting point for the 
test would be the Role Delineation for PAs. Considerable effort had gone 
into the development of the Role Delineation and into its verification. 
It was also recognized that the 11 Role Areas, although judged critical to 
PA competence, may not serve as the best feedback mechanisms for continuing 
FIGURE 3 

MAP OF THE 17 ROLE AREAS X BODY SYSTEM SCALES 

FIGURE 4 

MAP OF THE 28 BODY SYSTEM X MEDICAL INTERVENTION SCALES 

Body systems (rows): 

1. Musculo-Skeletal 
2. Dermatology 
3. Endocrine 
4. Eyes & ENT 
5. Respiratory 
6. Cardio-Vascular 
7. Hematology 
8. Gastro-Intestinal 
9. Genito-Urinary 
10. Reproductive 
11. Neurology 
12. Psycho-Social 
13. Other (Pharmacy) 

[Grid of scale numbers by body system and medical intervention type] 

TABLE 3 
DESCRIPTION OF THE ROLE AREA SCALES 

Scale: Professional Role (Scale 1) 
Role areas: Recognize interdependent relationship with supervising 
physician; Maintain competency; Promote acceptance of the role 
Description: Items on this scale are related to understanding the PA 
role, working within the role, maintaining competency as a PA, 
explaining the PA role to others, and displaying appropriate PA behaviors. 

Scale: Interpersonal Behavior (Scale 2) 
Role areas: Demonstrate professional behavior; Establish effective 
interpersonal relationships with patients, professionals, and others 
Description: Items describe behavior which involves interactions with 
others, especially to demonstrate concern, respect, and empathy with 
the other. 

Scale: Gather Data (Scales 3-5) 
Role areas: Establish health status data base 
Description: Items demonstrate basic knowledge essential to the data 
gathering process, i.e., the PA knows what information to collect and 
which diagnoses are possible, given certain information. 

Scale: Analyze Data (Scales 6-10) 
Role areas: Analyze health status data base 
Description: These items demonstrate the use of knowledge in the 
decision-making process, i.e., the PA can interpret data from laboratory 
tests, history, and physical examination to lead to a working diagnosis. 

Scale: Manage Patients (Scales 13-16) 
Role areas: Formulate health management plan; Implement health 
management plan; Monitor health management plan 
Description: Items demonstrate whether, for a diagnosed problem, the PA 
can develop a plan of action, carry out the plan, and/or monitor 
progress in order to make any necessary modifications in the plan. 

Note: Scales 11, 12, and 17 combine two or more descriptions. 
education purposes. For example, being told that one is deficient in 
data gathering skills may be too global a diagnosis in terms of making 
a meaningful remedial prescription understandable to a PA. The Item 
Writers Committee was particularly sensitive to this dilemma and urged 
consideration of other criteria, especially Body Systems, on an additional 
basis. The Test Specifications Committee was sensitive to this problem 
too, and had recommended that other criteria, such as patient type, 
medical intervention type, body systems, medical specialty, and common 
patient presenting symptoms (MCV Disease), also be considered when 
developing test items. The intent on their part was representation 
of clinical practice. Table 4 presents the levels for each of the seven 
classification variables. 

It was the judgment of the two Committees that Body Systems
represented the most useful criterion to use in scale development for
several reasons: 1) most textbooks are organized by body systems,
2) body systems provide a better reference approach in studying
patient problems, 3) medical specialty is not useful to physician
assistants because of the profession's emphasis on primary care, and
4) the most commonly presented patient symptoms (MCV Disease Categories,
Marsland et al., 1976) are not comprehensive and are inconvenient
because there are too many categories.

"Medical intervention" was selected over "patient characteristics" 
because it provides a well-defined classification scheme in patient care. 
Emergency care is now well recognized as a class by itself and health
maintenance is fast emerging as a new thrust of societal interest. 

The five skills areas were very much in the minds of item writers 
when developing the test. However, the number of levels was dropped 
from five to three in order to simplify the task for item writers. Note the 
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CODES/LEVELS FOR CLASSIFICATION VARIABLES 
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special instructions (Table 5) to help item writers recognize these
three levels of skills. Furthermore, test items in each Test
Section were classified in terms of the original Test Specifications
Matrix as shown in Table 6. This three-level approach to skills is the
one that has been adopted for all test items in the Item Bank now being
developed with AAPA. The concern for representing all three types of skills in
each scale of the test persisted throughout the project. It was
recognized that interpersonal skills are the most difficult to measure, and
that most test items tend to fall at the knowledge level.

E. Item Generation 

The major test development approach utilized in this project was
derived from the critical incidents technique first proposed by Flanagan
(1954). Given the 55 cells defined by the 11 role areas and the five skills
levels, item writers were asked to identify critical incidents for each
cell. Furthermore, the item writers were asked to utilize their experience
to describe the critical role of the PA in the incident (patient scenario)
in terms of skills needed and errors likely. As critical incidents were
identified and the needed major skills and typical errors noted, items
began to be written and situational details added. Discussions
ensued within each group as to how typical a given scenario was in PA
practice, and changes were made accordingly. This approach to item
generation, used in conjunction with the Role Delineation, is unique in
that it maximizes a concern for the critical characteristics of job
performers rather than merely considering critical dimensions of the
job (Pottinger, 1977). Table 7 presents the instructions to Item
Writers. Table 8 presents the Critical Incident/Scenario developed by
one item writer for Role Area 4. Table 9 presents a Test Item
generated from this Scenario. Note that the Guide for Generating
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TABLE 5 



How to Classify Test Items by Skills Level¹

In classifying each item, three skills levels were used:

    knowledge
    problem solving
    interpersonal skills

1. Knowledge refers to any item requiring factual recall of information.
This was used in cases where a diagnosis (as uncomplicated trichomonal
vaginitis) or condition (as dark urine) was identified and specific
treatment procedures, data gathering techniques, or potential causes
were requested. The key element in these items is that the examinee
is given a clearly identified and limited context in which to provide
specific information (lab procedures for vaginal discharge are...;
conditions causing asthmatic symptoms in the pediatric age group
include...).

2. Problem solving refers to any item involving two steps. First, the
examinee must analyze and order the information provided in a problem
situation (logical thinking). In this first step, the examinee infers
what the problem really is. Second, the examinee both recalls and
applies previous knowledge and experience in determining appropriate
courses of action. The category problem solving was used primarily
in those items describing a patient with signs, symptoms and/or
presenting complaint. These items usually required both identifi-
cation of the problem and determination of appropriate actions.

3. Interpersonal skills refers to those items clearly requiring the
use of effective human relations skills.

¹ This classification strategy was prepared by Cherry Turner,
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TABLE 6 

ASSIGNMENT OF TEST SECTION 1 ITEMS TO SPECIFICATIONS MATRIX 
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VI. FORMULATE PLAN 
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37, 41, 74, 77




VII. IMPLEMENT PLAN 


51 


1, 5, 14, 38, 67


18


VIII. MONITOR PLAN


20, 75 


16, 23, 33, 44, 73 
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IX. INTERPERSONAL 
RELATIONSHIP 






IG, 66 


X. COMPETENCY 


11






XI. ROLE ACCEPTANCE







21, 36
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TABLE 7 

GUIDE FOR GENERATING TEST-ITEM SCENARIOS¹

I.  Study the Role Delineation Model to Identify/Think of
    Critical Incidents

    Note: A critical incident is defined as a set of behaviors
          that characterize either effective or ineffective
          performance. Identifying these extremes of a
          performance dimension in terms of critical incidents
          helps to understand and to define the performance
          dimension for measurement purposes.

    1.1  Pick an item from the Model. Start with #IA.
         (Role Area I, Sub-area A) Accept that the
         role of the PA is limited by supervising
         physician, legal limitations, etc.

    1.2  Think of a situation/incident in which a PA
         very effectively accepted his/her role
         limitations.

    1.3  Think of a situation/incident in which a PA
         handled his/her role limitations very ineffectively.

II. Write a Test-Item Scenario for this PA Critical Incident

    2.1  Describe generally what happened during the
         incident.

    2.2  Define the conditions in which this incident
         unfolds:

         2.21  The location/setting (hospital, office,
               etc.)

         2.22  The other health professionals involved

         2.23  The type of patient involved (sex, age,
               socioeconomic status, disposition, clinical
               condition, etc.)

         2.24  The type of health care situation involved
               (preventive, remedial, rehabilitative,
               etc.)

    2.3  List the major skills that the PA needs to
         handle this situation effectively.

    2.4  List some typical errors, mis-cues, slip-ups
         that a PA might succumb to in this situation.

    2.5  List some remedial learning prescriptions that
         you would recommend in the case of each error.

¹ Prepared by Ayres D'Costa
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AAPA 

Self-Assessment 
Exam 



TABLE 8 



Prepared by: 



SAMPLE SCENARIO 



DATE: 



Role Delineation Model Code: IV

Critical Incident Description:

Patient is an 11½ month old black male child living in a small community
in the northern Midwest. Parents are of lower socioeconomic status. There
are two older siblings (ages 1½ and 3 years), and the mother, who is four to
five months pregnant, is on welfare; there is no father in the home. The
child has been brought to a family practice office for a routine one year
old checkup.

The child's weight is 17 pounds, length 27 inches. During the course
of the physical examination you note that his legs are "bowed" with external
rotation of knees and internal rotation of the feet. You can elicit full
range of motion. The physical exam was otherwise within normal limits.
Through a more extensive history, you note that the child is on breast milk with
the only supplement being orange juice; he eats no solid foods other than
baby cereal and crackers. The mother reports that the child does not crawl
and makes little attempt to "scoot." Mother reveals she is unhappy about
her present pregnancy. She feels hassled and tired and, although she is very
emotionally caring about her children, feels that her burdens are almost
too great to handle.

Conditions:

Disease category - Musculoskeletal
Patient age - Pediatric
Patient sex - Male

Skills Needed:

- Complete nutritional history and social history

- Complete physical exam including hips and extremities

- Order x-rays of all extremities and chest

Errors Most Likely:

- Incomplete history

- Limited physical exam

- Inappropriate lab analysis

- Misdiagnosis (i.e., no labs ordered), therefore no treatment
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TABLE 9 

AAPA                      TEST ITEM               Prepared by:
Self-Assessment
Exam                                              Date:

Role Delineation Model Code: IV
Item #: 1                 Correct Response: C

An Mi year old black male child is seen in your
office for routine physical exam (one year old check).
During exam you note external rotation of knees and internal
rotation of feet. You can elicit full range of motion,
and hips are normal. Otherwise, the physical exam is
within normal limits. Your next step should be to:

A.  Determine that he has tibial torsion and
    prescribe orthopedic shoes.

B.  Refer to orthopedics for tibial torsion.

C.  Obtain radiologic diagnosis to confirm your
    tentative diagnosis of rickets.

D.  Refer to supervising physician because you
    cannot decide what problem exists.

E.  Explain to mother that many children have
    "bowed" legs and that he will grow out of
    this.
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Scenarios bypasses the usual development of behavioral objectives and
amplified objectives as recommended by Popham (1978). Instead, the item
writer moves directly to the identification of a critical incident
related to the test domain in which the PA functioned either very effectively
or very ineffectively. The second step involves the usual amplification
process (conditions, skills, errors), but it is modified so as to obtain
the material needed to construct meaningful item distractors. Linkages
are also established with the remedial learning prescriptions.

Three major types of item stimuli were proposed to the item
writers: the patient conditions/problems scenario, scientific graphic/tabular
data or reading passage, and the regular multiple-choice item. These
constitute three basic types of stimuli--people/situation encounter,
data/report/graphic presentation, and the direct verbal question. Each
stimulus type has its own peculiar challenges, although the people-type
tends to be more unbounded and therefore more complex and challenging.
Data and graphic stimuli require specific scientific sophistication and skill,
although they can be more straightforward and clearcut. The verbal
type of multiple-choice item is the typical examination test item
where terminology is important.

Various item styles are associated with each of the above major
types of item stimuli, such as classification, relationship or varia-
tion analysis, trend/sequence analysis, true-false, five choice comple-
tion, five or four choice association, excluded term, quantitative
comparison, and multiple completion (K-type). These and other styles
were illustrated in the Guide for Item Writers, thereby suggesting ways
to assess different types of cognitive skills. Special efforts were
also made to identify multiple-choice strategies to get at interpersonal
skills and professional attitudes, e.g., by the use of situations,
dilemmas, and best answer items.
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The objective of item sampling was not to represent all skills
but just those essential behaviors at the terminal level (principle of
subsumption). Thus the unnecessary testing of intermediate behaviors
was to be avoided in favor of significant generalizable skills with
transfer value. Yet the intent of the test was diagnosis, and for this
reason the test items could not be extremely difficult or representative
of above average/excellent performance. The test must represent all
entry-level generalist skills in order to represent minimum competence of
the PA. Finally, the items must be stratified so as to represent the
domain of interest, and random within each stratum in order to be
replicable.
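
The stratify-then-randomize requirement above can be sketched in a few lines. This is a minimal illustration only, not the procedure the project actually ran: the function name, the use of (role area, skill level) cells as strata, and the item numbers are all assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_sample(items, per_stratum, seed=0):
    """Draw a fixed number of items at random from each stratum.

    items: list of (item_id, stratum_key) pairs.
    per_stratum: dict mapping stratum_key -> number of items to draw.
    """
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    pools = defaultdict(list)
    for item_id, stratum in items:
        pools[stratum].append(item_id)
    sample = {}
    for stratum, k in per_stratum.items():
        pool = sorted(pools[stratum])
        if k > len(pool):
            raise ValueError(f"stratum {stratum!r} has only {len(pool)} items")
        sample[stratum] = sorted(rng.sample(pool, k))
    return sample


# Hypothetical item pool: each stratum is a (role area, skill level) cell.
bank = [(i, ("IV", "knowledge")) for i in range(1, 11)]
bank += [(i, ("VI", "problem solving")) for i in range(11, 19)]
quota = {("IV", "knowledge"): 3, ("VI", "problem solving"): 2}
picked = stratified_sample(bank, quota)
```

Fixing the quota per cell keeps the test representative of the domain, while the random draw inside each cell keeps any one form from depending on a hand-picked set of items.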

Emphasis must also be placed on the proper development of useful
response options. It was expected that the typical errors identified
for each scenario would lead to the construction of appropriate options.
Some of the more common error patterns are: not utilizing all the data
provided in the scenario; misinterpreting a technical term; sex-related
bias; missing a significant cue; making computational errors including
transposing numbers or misplacing the decimal point; using affect-based
problem-solving rather than a methodical, logical approach; and inability
to handle scientific data correctly.

The documentation of the correct response to an item must be of
concern to the test developer. Attempts must be made to validate the
correct answer by reference to a standard text, as well as through the
process of peer review. Items written by one group of item-writers
were reviewed by another group of item-writers. All test items were
critiqued in the two try-outs by PAs and by the members of the two
Committees. Such critiques point out difficult and esoteric words
that creep into items depending on the background and experience
of the author. Items that have obnoxious terms or socially undesirable
ideas must be modified. Finally, there is a need to edit items for format
awkwardness or inconsistencies, for spelling and grammatical errors,
and for technical inaccuracies or omissions.

In the case of criterion-referenced tests, there are two somewhat
unique item reviews that have to occur, typically by an impartial
group of experts. The first is a review of the assignment of the item to the
specific scale(s) in the specifications matrix. This is a matter of
content and construct validity and is critical to the generalization
expected in the score interpretation process. The second is a review of the
options in each item to identify the correct response option and to
identify those options that would be quickly rejected by a minimally
competent PA. This latter process is part of the Nedelsky Technique
(1954), designed to compute an absolute minimum competency score.
Nedelsky believed that a group of judges could make such decisions
reasonably consistently and thus come up with a dependable minimum
competency score. If this is the case, he reasoned that the item is
likely to have significant theoretical meaning and the error options
then also become educationally significant.
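
As an illustration of how such judgments become a score, the sketch below follows the usual textbook reading of the Nedelsky Technique: an item's minimum pass level is the reciprocal of the number of options left after a judge removes those a minimally competent PA would reject, and the item values are summed over the test. Averaging across judges is an assumption made here, as are the function names and the example ratings; the Report does not state how the judges' ratings were combined.

```python
def nedelsky_item_value(n_options, n_rejected):
    """Chance of a correct guess once the rejected options are ruled out."""
    remaining = n_options - n_rejected
    if remaining < 1:
        raise ValueError("the correct option can never be rejected")
    return 1.0 / remaining


def nedelsky_cut_score(items):
    """Sum the judge-averaged item values.

    items: list of (n_options, [number rejected by each judge]) tuples.
    """
    total = 0.0
    for n_options, judge_counts in items:
        values = [nedelsky_item_value(n_options, r) for r in judge_counts]
        total += sum(values) / len(values)  # average across judges (assumed)
    return total


# Hypothetical ratings from three judges on three five-option items.
example = [
    (5, [3, 3, 3]),  # two options remain: item value 1/2
    (5, [4, 4, 4]),  # only the key remains: item value 1
    (5, [0, 0, 0]),  # nothing rejected: item value 1/5
]
cut = nedelsky_cut_score(example)
```

For these made-up ratings the minimum competency score is 0.5 + 1.0 + 0.2 = 1.7 items out of 3.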

These logical deductions by Nedelsky are pertinent to the con- 
struction of a self-assessment examination. It is therefore hoped 
that the scores derived from the response data will substantiate the 
true proficiency level of a PA and identify the prevailing error 
patterns among persons taking the examination. 

The item generation process in this project has been very hectic
and dependent upon physician assistants most of whom did not have 
prior experience in test development. Yet the output of some 425 test 
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items, of which 315 were considered reasonably worthwhile to include
in the Second Try-Out, is gratifying. Each item is being "banked"
in an Item File so that a record of its development is maintained.
A sample "item" is depicted in Figure 5. A file of comments and
suggested revisions to items is also being maintained. Wherever
appropriate, revisions are being recorded in the Item File.

Each test item is identified by an Item Documentation Card on
which are recorded the seven classification criteria for that item as
indicated in Table 4. Additionally, this computer card indicates the
numbers of the two sets of Scales to which the item has been assigned,
the correct response key, its location in the two Try-Out Tests, and any
significant recommendations for its future revision/deletion. A sample
listing of these cards is presented in Figure 6. The Cards will even-
tually include the minimum competency score as derived from the
Nedelsky Technique.

It is possible to derive a Scoring Key for any scale or for the
total test with the help of these Item Documentation Cards and a simple
computer program. It is planned to use these Cards as a simple Item
Retrieval System so that items of any desired characteristics can be
selected using an IBM Sorter. The cards then direct one to the Item
File from which a hard copy can be Xeroxed. Obviously this system is
not exotic, but we believe it is reasonably flexible and it is
inexpensive to maintain.
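
A present-day equivalent of the card deck and sorter might look like the sketch below. It is illustrative only: the field names, the numeric skill-level codes, and the card values are assumptions, and only a few of the classification criteria are carried to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemCard:
    """One Item Documentation Card (field names are illustrative)."""
    item_id: int
    role_scale: int      # scale number in the Role set
    system_scale: int    # scale number in the Body System set
    skill_level: int     # assumed codes: 1 knowledge, 2 problem solving, 3 interpersonal
    key: str             # correct response letter


def select_cards(cards, **criteria):
    """Return the cards whose fields match every given criterion."""
    return [c for c in cards
            if all(getattr(c, name) == value for name, value in criteria.items())]


def scoring_key(cards):
    """Map item number to correct response for a set of cards."""
    return {c.item_id: c.key for c in cards}


def score(responses, key):
    """Count the keyed items that were answered correctly."""
    return sum(1 for item_id, correct in key.items()
               if responses.get(item_id) == correct)


# Hypothetical cards; the values are made up for the example.
cards = [
    ItemCard(1, role_scale=4, system_scale=8, skill_level=2, key="C"),
    ItemCard(2, role_scale=4, system_scale=3, skill_level=1, key="A"),
    ItemCard(3, role_scale=9, system_scale=8, skill_level=2, key="D"),
]
role4_key = scoring_key(select_cards(cards, role_scale=4))
role4_score = score({1: "C", 2: "B", 3: "D"}, role4_key)
```

Selecting on one field gives the Scoring Key for a scale; selecting on several fields at once plays the part of successive passes through the sorter.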

F. Test Item Revision 

Two try-outs have been conducted of the test items generated
for the self-assessment examination. A standard item-analysis program 
was utilized to generate information about the quality of test-items 
and facilitate their review and revision. The Second Try-Out was based 
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FIGURE 5

SAMPLE TEST ITEM from Item Bank

Self-Assessment                                   Author:
                                                  Date:

Role Area: 5        Patient Age: 3        Body System: 8
Skill Level: 2      MCV Disease: 8        Med. Intervention:
Scale #: 9          Med. Specialty: 1     Scale #: 4

Text/Reference: Harvey, p. 611

Items 47, 48

A 22 year old white male college student comes to your office
complaining of anorexia and fatigue of recent onset. He has been
cramming for finals and jogging to obtain relief from tension. In
addition, he has developed wandering joint pains. Further history
reveals that he is a homosexual and has been treated for gonorrhea in
the past. He denies recent sore throat, urethral discharge and changes
in urine or stool color. He also feels that he is sick and tired of
smoking and would like some help with that in addition to being treated
for his current problem.

47.   Your initial differential diagnosis includes all of the following
(127) except:

      A. preicteric hepatitis

      B. infectious mononucleosis

      C. asymptomatic rectal gonorrhea

      D. disseminated gonococcemia

      E. depression

47

TOTAL CORRECT:  N = 5   PCT = 20.0        REL DIFF: .800

             1      2      3      4       5     BLNK
UPPER        1      1      4      1***    0      0
(PCT)      (14)   (14)   (57)   (14)    ( 0)   ( 0)
LOWER        1      0      1      3***    1      1
(PCT)      (14)   ( 0)   (14)   (43)    (14)   (14)
TOTAL        3      3      9      5***    4      1
(PCT)      (12)   (12)   (36)   (20)    (16)   ( 4)

DISCRIMINATION INDICES
CORR PHI = -.482 (SIG .20)
RPBIS = -.443 (ITEM-TOTAL)
OBTAINED D = -28.6




Items 66-67

A 22 year old white male college student comes to your office complaining
of anorexia and fatigue of recent onset. He has been cramming for finals
and jogging to obtain relief from tension. Further history reveals that
he is a homosexual and has been treated for gonorrhea in the past. He
denies recent sore throat, urethral discharge, and changes in urine or
stool color. He also feels that he is sick and tired of smoking and
would like some help with that in addition to being treated for his
current problem.

66.   Your initial differential diagnosis includes all of the following
      except:

      A. preicteric hepatitis

      B. infectious mononucleosis

      C. asymptomatic rectal gonorrhea

      D. psychosomatic symptomatology

      E. depression
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OBTAINED D = 14.5


TOTAL 
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5 
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7) 
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(50) 


(26) 
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FIGURE 6 

TEST ITEM DOCUMENTATION CARD LISTING 
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upon revisions done on the basis of item analysis of the First Try-Out.
The item-analysis data from the Second Try-Out have been examined only
cursorily, in order to report item quality indices in this Report. We
plan to utilize these item-analysis data to revise the items in the
Second Try-Out.

A standard Item-Analysis Package was utilized and provided the
following types of information:

i) Test Score Distribution, including raw score,
frequency distribution, cumulative frequency,
percentile rank, and standard scores;

ii) Summary Statistics, including mean, median,
mode, standard deviation, skewness and
kurtosis;

iii) Item analysis, including distribution of
responses of upper and lower 27 percent,
difficulty and discrimination indices; and

iv) Test quality indices, including reliability,
standard error of measurement, distribution
of item difficulty and discrimination indices
for entire test.

The following discussion is presented in order to explain how 
the item analysis information was utilized in the development of the 
self-assessment examination. > 

It is useful to examine the test score distribution to get a feeling
for the range and clustering of test scores. The measures of central
tendency, dispersion, skewness, and kurtosis are also important in
deciding whether there is a preponderance of masters (hopefully) or
non-masters in the profession. Obviously this assumes that a minimum
competency score has been decided upon and can be superimposed on this data.

The item analysis strategy of comparing the upper 27 percent with 
the lower 27 percent is also valid because it provides a comparison of 
extreme non-masters and masters. Items with negative discrimination should 
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clearly be avoided. However, there is no reason to select only test items
with high discrimination. It must be recognized that in a mastery/non-
mastery testing situation, the items are not chosen for their power of
separating individuals, but rather because they serve as representatives
of critically important responsibilities of the profession. Items may
therefore be selected from a narrower range of discrimination, but
they must discriminate between masters and non-masters.
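
The upper/lower comparison can be made concrete with a short sketch. The item-analysis package used in the project is not identified in this Report, so the function below is an assumed, minimal version of the standard calculation. The example data are hypothetical, arranged only so that the group counts match those printed for item 47 in Figure 5 (25 examinees, 1 of 7 correct in the upper group, 3 of 7 in the lower group, 5 correct overall).

```python
def upper_lower_analysis(total_scores, responses, key, fraction=0.27):
    """Upper/lower group statistics for one item.

    total_scores: total test score for each examinee.
    responses: option chosen on this item, in the same examinee order.
    key: the correct option.
    """
    n = len(total_scores)
    ranked = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    size = max(1, round(fraction * n))          # examinees per extreme group
    upper, lower = ranked[:size], ranked[-size:]
    correct_upper = sum(responses[i] == key for i in upper)
    correct_lower = sum(responses[i] == key for i in lower)
    correct_all = sum(r == key for r in responses)
    return {
        "group_size": size,
        "pct_correct": 100.0 * correct_all / n,
        "d_index": 100.0 * (correct_upper - correct_lower) / size,
    }


# Hypothetical responses, ordered from highest to lowest total score.
scores = list(range(25, 0, -1))
responses = (["D"] + ["C"] * 6        # upper 7: one correct
             + ["D"] + ["C"] * 10     # middle 11: one correct
             + ["D"] * 3 + ["A"] * 4)  # lower 7: three correct
result = upper_lower_analysis(scores, responses, key="D")
```

With these counts the percent correct is 20.0 and the discrimination index rounds to -28.6, so the item would be flagged for revision under the rule that negative discrimination is to be avoided.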

One way of implementing this using the regular item-analysis is 
to compare the lower 27 percent with the remaining 73 percent. Propor- 
tionately more of the lower group should choose the wrong options 
(distractors) than the remaining group. Note that this assumes that 
in a typical profession, about 70 to 80 percent should be reasonably 
competent persons, unless there is something seriously wrong with its 
certification and training process. 

Item analysis also provides valuable information about the power
of the distractors. Are they serving their function? The fact that
certain options are not being selected may indicate that these particular
errors/weaknesses have been well-attended to in previous training. Such
options should not be excluded. The critical criterion for retaining
response options must be relevance to professional skills and pitfalls.

Next, there is the question about the difficulty index or its 
converse, the number of persons who get the item right. Theoretically, 
the items should be selected with reference to professional competency 
relevance, rather than difficulty. It is expected that most normal 
professions would find a large percentage, say 70 or even 80 percent of 
their membership competent. Therefore, the items in such a test would 
appear easy. 
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Thus, we would argue that the three Ds of test construction--
discrimination, distractors, difficulty--are also of importance, albeit
in a very different way, in criterion-referenced test development. As
with those who ignore the lessons of history, those who choose to ignore
such data would stand condemned by them.

In this project, the item developers were provided a simplified
summary report, based on the item analysis data for each test item.
The three Ds were presented by using certain codes. Also summarized
for each item was a code which indicated if the item was questioned in
terms of its relevance to the PA profession by those PAs who took the
First Try-Out examination. It will be remembered that each item was
separately rated for relevance on a five-point scale. These ratings were
summarized across raters.

Finally, we have the overall test quality indices, especially 
reliability and standard error of measurement. Standard texts on 
criterion-referenced testing warn against the use of the traditional 
methods of computing reliability, such as the correlation coefficient. 
Suspect also are measures such as Kuder-Richardson Formula 20 and Hoyt's 
Index. The main reason for this warning is because of the deliberate 
reduction in variance that occurs in criterion-referenced tests. Other 
indices are therefore proposed such as "Kappa" (Cohen, 1960) and "co- 
efficient of agreement" (Subkoviak, 1976). Hambleton and Eignor (1979) 
suggest the Subkoviak approach when a test is only administered once. 
Accordingly, a computer program was written to compute this coefficient 
for the total test, for each of the 17 Role scales, and for each of
the 28 System scales of the Self-Assessment Examination. 
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The "coefficient of agreement" was originally defined as the
probability that each individual in a group will be consistently
classified as a "master" or "non-master" on two successive parallel tests.
However, like the Kuder-Richardson Formula 20, it is possible to estimate
this index (coefficient of agreement) from a single administration of the
test by using the assumption that all items are equally difficult or
reasonably so. The decision to classify as a "master" is based on
achieving the minimum competency score identified through the Nedelsky
Technique by PA experts.
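
The single-administration estimate can be sketched as follows. This is an illustration in the spirit of the Subkoviak approach, not a listing of the program written for the project: it assumes a binomial model (all items equally difficult), a regressed estimate of each examinee's true proportion correct, and a reliability value supplied by the caller. The function names and the sample scores are hypothetical.

```python
from math import comb


def prob_at_or_above(n_items, cutoff, p):
    """Binomial probability of scoring at or above the cutoff."""
    return sum(comb(n_items, x) * p**x * (1 - p)**(n_items - x)
               for x in range(cutoff, n_items + 1))


def coefficient_of_agreement(scores, n_items, cutoff, reliability):
    """Estimated chance of the same master/non-master decision on two forms."""
    mean = sum(scores) / len(scores)
    total = 0.0
    for x in scores:
        # Regress the observed proportion toward the group mean (assumed step).
        p_hat = reliability * (x / n_items) + (1 - reliability) * (mean / n_items)
        p_master = prob_at_or_above(n_items, cutoff, p_hat)
        # Consistent means master on both forms or non-master on both forms.
        total += p_master**2 + (1 - p_master)**2
    return total / len(scores)


# Hypothetical scores on a 10-item scale with a minimum competency score of 7.
sample_scores = [4, 6, 8, 9, 10, 10, 7, 5]
agreement = coefficient_of_agreement(sample_scores, n_items=10, cutoff=7,
                                     reliability=0.8)
```

Because each examinee contributes a value between 0.5 and 1, the group coefficient also falls in that range; examinees whose estimated proficiency sits near the cutoff pull it toward 0.5.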

The coefficient of agreement is similar to "Kappa" and must be
interpreted like any probability value with a range from 0 to 1. "Kappa"
is an index of reliability appropriate for criterion-referenced tests.

The standard error of measurement is a critical index to present
in any test development effort. Hambleton and Eignor (1979) explain that
this index is valid for criterion-referenced testing as well.
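
For reference, the classical formula for this index is the score standard deviation multiplied by the square root of one minus the reliability. The sketch below assumes that formula and uses hypothetical scores; the Report does not state which reliability estimate was entered into the calculation.

```python
from math import sqrt
from statistics import pstdev


def standard_error_of_measurement(scores, reliability):
    """Classical SEM: score standard deviation times sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return pstdev(scores) * sqrt(1.0 - reliability)


# Hypothetical scale scores and reliability value.
sem = standard_error_of_measurement([10, 12, 14, 16, 18], reliability=0.75)
```

A smaller value means an individual's observed score is expected to lie closer to his or her true score, which matters most for scores near the minimum competency score.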

G. Test Development Statistics 

The data gathered as a result of the various analyses conducted
in the development of this test are far too voluminous to present in
this Report. Instead the following selected summary Tables will be
presented as Exhibits without discussion:

i) Frequency distribution of Total Test Scores 
(Exhibit C); 

ii) Means, standard deviations, and Minimum Com- 
petency Score for the 17 Role Scales (Exhibit D); 

iii) Means, standard deviations, and Minimum Com- 
petency Score for the 28 System Scales (Exhibit E); 

iv) Distributions of Item Difficulty and Discrimi- 
nation indices for Total Test (Exhibit F); 
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v) Reliability, Coefficient of Agreement, and SEM
for Total Test and for the 17 Role Scales
(Exhibit G);

vi) Reliability, Coefficient of Agreement, and SEM
for the 28 System Scales (Exhibit H); and

vii) Assignment of Second Try-Out Test Items to
original Test Specifications Matrix (Exhibit I).

H. Test Interpretation 

"The most obvious benefit of self-assessment", wrote Hess and 
Morrean (1976), "is that with minimum personal consequences physicians 
can readily ascertain what they know (or do not know) in given areas--
areas where their future decision-making may have profound consequences."
This self-assessment examination for physician assistants requires 
about six hours of personal time investment, but the payoff could con- 
sist of good information in several dozen areas (scales)--information 
that could assure competent professional care. To be useful, informa- 
tion must be reliable, valid, complete, relevant and usable. 

The matter of reliability has already been dealt with. It concerns 
consistency in making judgments from one occasion to another. Judgments 
must agree with one another if they are to be reliable. The coefficient
of agreement provides data on this matter.

We are concerned here with the proper utilization of the results of 
this examination and the issue of validity becomes central. Typically, 
criterion-referenced testing has been limited to content validity based 
upon expert judgment. The process of scale-development takes us beyond 
this to construct validity where our interest lies in the underlying 
scale area. The process to-date has utilized largely the judgment of 
expert Committees to assure us that the items do indeed represent and 
will therefore predict the scale area. The discrimination index computed 
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in the item analysis provides some validity insights as well. If a test 
item is a valid measure of a construct, then persons who do well on the 
overall scale should get that particular item correct as well. Although 
this reasoning is somewhat circular it is nevertheless useful. 

We do not yet have data to report that would assure us of the pre- 
dictive power of the test. Our interest is largely in short-term predic- 
tion at this time because the main purpose is to motivate the professional 
to undertake the continuing education he needs or wishes. To say that 
persons who do well in the self-assessment exam will also perform well on 
the job seems like a tall order. There are too many other necessary 
conditions that must be met before undertaking such predictions. 

The concept of validity conjures up notions among researchers of
being able to draw justified inferences (internal validity) and to
generalize beyond the present circumstances (external validity).
Translating these notions to this exam, one asks: Is success in the exam
attributable entirely to possession of the appropriate professional
knowledge and skills? We know that this is not necessarily true. Lack
of test-wiseness, mental stress, preoccupation with other matters, and
physical status can all affect how one fares in a test. Fortunately, a
self-assessment exam reduces many of these usual test performance
problems because the exam can be taken at one's leisure and without
time pressures. However, the problems of generalizability remain unless
the test items and the expected performance standards are fair to all
individuals in the profession. Problems of generalizability can occur
because of differences in practice characteristics, or because of
personal interests and motivations. Because of the recognition of these
problems, this particular self-assessment has planned a three-way approach
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to deriving continuing education prescriptions, namely, practice (P) 
characteristics, personal interests or needs (N), and examination-derived 
indices (E). 

This three-way analysis ensures relevance and completeness to the 
self-assessment. Too often such assessments are limited to tests and 
they fail to recognize the importance of the other critical forces that 
impinge on the decision to seek continuing education. Perhaps a fourth
factor in the AAPA program lies in the built-in awarding of credit, a 
positive reinforcement that will strengthen the three-way process further. 

We are then left with the matter of usability of the information 
derived from the self-assessment. This project has developed a
computer-generated four-page report which provides answers to the following
types of questions (See Exhibit J):

1. How did I do on the test as a whole? 
(Total Test Score) 

How do I compare with other PAs? (Mean 
and Standard Deviation for a national 
sample of PAs, Page 1);

2. Which items did I get wrong? 

What was the right answer? (Print-out of 
the individual's responses with correct 
answers keyed to each response, Page 1); 

3. How did I fare in specific areas pertaining 
to my role as a PA and where am I expected 
to perform satisfactorily? (Print-out of 
scores on 17 Role scales compared with max- 
imum possible score and minimum competency 
score, Page 2); 

4. How did I fare in terms of my knowledge and 
skill relative to the major Body Systems? 
(Print-out of scores on 28 System scales 
compared with maximum possible score and 
minimum competency score, Page 3); and 

5. How do my exam scores (E) compare with my 
personal need scores (N) and with the 
clinical practice (P) I am engaged in? 
(Graphic display of the P, N, E converted 
scores for each of the 28 Body Systems 
Scales, Page 4). 

Accompanying this four-page report is an Interpretive Leaflet which
helps the individual PA understand the report and utilize it for his
self-assessment (See Exhibit K). The leaflet follows the sequence of the
four-page report and explains how each piece of data might be utilized by
the PA. Attempts were made to avoid using technical terms except where
absolutely necessary. Any attempt to talk down to a PA had to be avoided
as well. Accordingly, some technical terms, such as reliability and
standard error of measurement, were retained. These are critical to the 
proper interpretation of test scores and have become part of common 
scientific language. 

Several technical questions will probably arise at this time: 

1. Is it technically correct to report raw scores?
Are they valid measures of the constructs they
represent?;

2. What about the reporting of Mean and Standard
Deviation of the PA national sample? Isn't this
going back to norm-referenced testing? How
representative was the sample?;

3. How was the minimum competency score derived?
How do I interpret my own score relative to
this standard? How reliable and valid is this
comparison?; and

4. How were the graphic indicators of P, N, and E
derived? Is this comparison appropriate for
criterion-referenced testing?

The use of raw scores in reporting test results has become quite
commonplace in recent years. Interest tests like the Ohio Vocational
Interest Survey (D'Costa, 1969), the Strong-Campbell Interest
Inventory (Campbell, 1972), and others have preferred to use the
raw score instead of a group-referenced score like the standard
score because it is recognized that the major interest in comparisons
lies within the individual's own system of needs and preferences. Often
the individual wishes to compare his strength in one area with another area
without reference to his relative standing in his peer group. A
dermatologist does not care to know that his knowledge of skin diseases
is superior among all physicians. He does care for comparisons with
other dermatologists, but more importantly he is interested in knowing
his strengths and weaknesses relative to himself alone in order to
determine what thrust his practice should take.

The matter of validity of raw scores is then based upon their 
relevance for the type of decisions that must be made. This is partic- 
ularly true for criterion-referenced tests since the major part of the 
interpretation of such tests is located in the criteria or constructs 
being measured. Given the fidelity to criteria inherent in the
test-development process, such interpretation therefore becomes technically
appropriate and defensible. However, it must be noted that the main 
interest is not in the precise differences between two scores but in their 
relative distance from their minimum competency scores which serve as 
their points of reference. Compare this with the use of group mean scores 
as the points of reference, and it becomes evident why the raw score 
coupled with its minimum competency score is appropriate for
self-assessment.

The reporting of the mean and standard deviation of a national 
sample is essentially to satisfy individual curiosity and make the
self-assessment more interesting and meaningful to some individuals who must
use this type of indirect peer pressure to motivate themselves. This is
a norm-referenced technique and its validity depends upon the
representativeness of the national sample of PAs.

The norm-group in the Second Try-Out is a sample of 108 PAs who
voluntarily responded to the self-assessment exam as of July 1, 1979, out
of some 274 PAs who were selected on the basis of a stratified random
sample of the AAPA membership. Effort was made to represent various
geographic locations and practice characteristics. Excluded from the sample
were persons who were involved in the test-development process. In terms
of number of years in practice, the responders were distributed as
follows:

Less than 1 year                        20 percent,
One year or little more                  4 percent,
About two years                         18 percent,
About three or four years               25 percent, and
More than four years                    34 percent.

In terms of type of patient care provided, the responders were
distributed as follows:

Medicine (family, general, internal)    78 percent,
Surgery                                 20 percent,
Pediatrics                               2 percent, and
Obstetrics/Gynecology                    1 percent.

The above data has been rounded off to the nearest integer and
hence the totals add to 101. No data is currently available by which
to judge the representativeness of this sample with respect to the PA
population. Informal opinions indicate that the two distributions are
not surprising. Until a national profile of PAs is developed and
becomes available it is difficult to make such comparisons. However, the
sample does include PAs with a wide range of practice experience. Also,
it is encouraging to note the large percentage of generalists in the
sample.

The computation of the Minimum Competency Score is based upon
the Nedelsky technique. In essence it amounts to recognizing that in
a five-option test, a minimally competent professional must be capable of
rejecting some of these options because he immediately recognizes the
erroneous thinking in them. If three options are expected to be rejected,
then only the remaining two options serve as legitimate distractors and
the expected score on this item is therefore 0.5. Using this approach
with each item, it then becomes possible to compute the expected score
on the total test for a minimally competent PA. The term "minimally
competent PA" is intended to indicate basic or expected level of skills
for an entry-level generalist PA.
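
The arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, each item is taken to have five options, and a single set of judgments is used, since the way the individual judges' ratings were pooled is not specified here.

```python
def nedelsky_item_score(n_options, n_rejected):
    """Expected item score for a minimally competent examinee who
    guesses at random among the options he cannot reject."""
    remaining = n_options - n_rejected
    if n_rejected < 0 or remaining < 1:
        raise ValueError("at least one option (the keyed answer) must remain")
    return 1.0 / remaining


def minimum_competency_score(rejected_counts, n_options=5):
    """Sum the expected item scores over all items on a scale or test.

    rejected_counts holds, for each item, the number of options the
    judges expect a minimally competent PA to reject.
    """
    return sum(nedelsky_item_score(n_options, r) for r in rejected_counts)
```

With three of five options rejected, `nedelsky_item_score(5, 3)` gives the 0.5 quoted in the text; summing such values over the items of a scale gives that scale's minimum competency score.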

The judgments are made by experts who in this case were the 
members of the Test Specifications Committee; two are physicians, 
and three are PA educators or PAs. Each rating was made independently 
and later discussed at a Committee meeting in June.

Technically, a standard such as a minimum competency score is 
treated like an absolute cut-off score. You have either attained it 
or you have not. In self-assessment, there is no need for such
absolutism and its concomitant hazards. For every test score, measurement
experts point to a standard error of measurement. It is recognized that
one's true score may lie within a range of two standard errors (one on
either side of the observed score) in about two-thirds of the cases. Thus
if one's score is 200 and the standard error is 7, the true score may lie
within 193 and 207. Wider ranges are prescribed if greater confidence, or
a lesser margin of error, is required.

It is therefore recommended that a raw score distance from a
competency standard be interpreted in terms of the standard error of
measurement. If your raw score is 200 and the competency standard is 205, while
the standard error of measurement is 7, it should be recognized that
here the discrepancy is not large enough to cause worry about one's
competence. As such a discrepancy approaches 3 or more multiples of the
standard error of measurement, serious concern should occur.
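
The recommendation above can be illustrated with a short Python sketch. The function names are hypothetical, and the fixed threshold of 3 is a simplification of the text's "approaches 3 or more multiples"; a shortfall is counted only when the score lies below the standard.

```python
def true_score_band(score, sem, multiples=1.0):
    """Return the (low, high) band of +/- `multiples` standard errors."""
    if sem <= 0:
        raise ValueError("sem must be positive")
    return (score - multiples * sem, score + multiples * sem)


def shortfall_in_sems(score, standard, sem):
    """Standard errors by which the score falls below the standard
    (0.0 when the score is at or above the standard)."""
    if sem <= 0:
        raise ValueError("sem must be positive")
    return max(0.0, (standard - score) / sem)


def serious_concern(score, standard, sem, threshold=3.0):
    """True when the shortfall reaches the assumed threshold in SEM units."""
    return shortfall_in_sems(score, standard, sem) >= threshold
```

For the example in the text (score 200, standard 205, standard error 7) the shortfall is 5/7 of one standard error, well under the threshold.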

In the case of the results of this particular examination, it must be
recognized this exam is still in its Try-Out stage of development and
the current process of revising items and validating the scales must be
moved forward before greater confidence can be placed in the scores.
Likewise the minimum competency scores might need revision based upon
the reaction of PAs-in-practice to them.

The reliability of discrepancy scores has been a thorny problem
in measurement because of the large error statisticians associate with
them. Where reliability is weak, it is difficult to get good validity
as well. Indeed reliability is a prerequisite for validity to occur.
The discrepancy data provided in the report is therefore not to be
interpreted literally but in context. The questions to ask are: Is this
really true of myself? Is there a difference in the way I think through
such problems compared to my peers? Is this important to me, as a
professional, in my practice? Where can I get more information and
assistance?

At the current time, the AAPA might be able to identify and offer 
a few CME programs to PAs interested in following up their self-assessment
reports. The ultimate objective is to link CME module recommendations to
the self-assessment so that effective follow-up is possible. It is hoped
that a modular system of learning packages can be developed so as to
have proper linkages with the Role Delineation. 

The graphic indicators of P, N, and E scores are based on standard 
scores which are technically known as "linear stanines". Stanines were
popularized by testing programs in World War II days. The word "stanine"
is derived from the standard nine points of reference utilized in this
technique. In norm-referenced testing, stanines are constructed so as to
be interpretable in terms of the normal curve. Linear stanines do not
assume a normal curve and are based on a simple linear transformation of
scores so that the new mean and standard deviation are always 5 and 0.5
respectively.
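
The linear transformation described above might be sketched as follows in Python. This is an illustration only, not the project's computer program: the function name is hypothetical, the target standard deviation is passed in explicitly rather than hard-coded, the population standard deviation of the raw scores is assumed, and values are rounded half up and held within the 1 to 9 range.

```python
import math
from statistics import mean, pstdev


def linear_stanines(scores, target_sd, target_mean=5.0):
    """Rescale raw scores linearly to the given mean and SD, then
    round to whole stanines and hold them within 1..9."""
    m = mean(scores)
    s = pstdev(scores)  # assumption: population SD of the raw scores
    result = []
    for x in scores:
        # With no spread in the raw scores, every score maps to the target mean.
        z = 0.0 if s == 0 else (x - m) / s
        value = target_mean + target_sd * z
        rounded = math.floor(value + 0.5)  # round half up
        result.append(min(9, max(1, rounded)))
    return result
```

Because no normal curve is assumed, a markedly skewed set of raw scores can produce transformed values outside the nine-point range; the `min`/`max` step is one simple way to keep reported values between 1 and 9.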

By converting all the scores used in the graphic display of P,
N, and E into the stanine system we obtain comparability both within the
P, N, and E indicators for one Body System Scale and between Body System
Scales. All the indicators can be compared with one another because they
have been transformed into this stanine system. This capability to
compare indicators is the crux of this P, N, and E report. The intent is to
allow the individual to make comparisons and check them out in terms of
his own internal beliefs about himself. Internal comparisons serve as 
stimuli rather than indictments about oneself and become the essential 
core of self-assessment. 

The appropriateness of group or norm-referenced information in a
criterion-referenced test is not a technical concern among experts like
Popham (1976) who noted that normative information often provides
additional insights into what should constitute an acceptable level of
performance. The power of a criterion-referenced test lies inherently in
its ability to describe what the individual can do, and the addition of
normative data adds to these insights. On the other hand, norm-referenced
tests by themselves cannot provide such individual descriptions and are 
therefore weak as diagnostic tools. 

What kind of comparisons can an individual make with P, N, E
indicators in the computer-generated report and how does one go about
deriving them? (See the Interpretive Leaflet for specific suggestions
for comparisons). For these insights it must be recognized that the
indicators are not to be used as precise measures. Measurement errors
are of concern here too, although the scale is much reduced. The
indicators are drawn proportionately in nine different lengths to represent
the nine stanines. In norm-referenced interpretations, 4, 5, 6 are
considered average, while 1, 2, 3 are below, and 7, 8, 9 are above
average.

Technically speaking, linear stanines are subject to aberrant
values when the distribution of scores is markedly skewed positively or
negatively. The computer program was designed so that computed stanine
values in such situations did not go below 1 or above 9. This
approximation was introduced for the sake of reporting convenience.



I. A Review of Technical Limitations and Deficiencies 

The self-assessment examination for PAs, as currently developed
by the AAPA under this contract, has several limitations which need
to be acknowledged:

i) The quality of the test item styles needs to
be improved. This criticism has to do with the
limited test item styles utilized by the item
writers. More items need to be written involving
graphs, charts, and scientific tables. There are
too many K-type (multiple completion) test items
and many of these do not take advantage of this
particular item format.

ii) The process of item revision needs to be continued
much farther. The revisions to-date are primarily
based on one try-out and expert opinions. The
item analysis from the second try-out must be
utilized and the process of revision continued
until the item statistics improve, and particularly
until the few negative and several very low
discriminations that still remain are removed.



iii) The quality of item options (distractors) must be
improved so that an effective system of error
patterns analysis can be developed.

iv) The scenarios developed must be reviewed for
criticality and representativeness of PA
performance. Only a few beginning steps have been
implemented relative to this interesting
technique.

v) There are not enough items identified for the
item bank. Several scales have less than 10
items per scale. This limits the reliability,
relevance, and the representativeness of the
scales that have been proposed for the
self-assessment model.

vi) The scale development process has been entirely
judgmental. No empirical analysis has yet
been undertaken to ensure the proper allocation
of items to scales or to ensure the homogeneity
of the scales. Factor analytic approaches are
available to generate scales for such
multi-factor test batteries. This criticism also
applies to the domain definition process. The
constructs pertaining to the scales are
judgmental and lack empirical validity data at this
time. Some scales may need to be deleted and
others added as the entry-level generalist PA
role gets better defined.

vii) More developmental studies are needed before
the diagnostic qualities of this examination
can be considered satisfactory in terms of
professional measurement standards. In
particular, deficiencies are noted in terms of
reliability and validity data for the various
scales. Many of the scales appear to have
very weak reliabilities at this time.
Several studies that can be done with the
current data have not yet been done.

viii) Although the content for this Report was in
part derived from the JO-point checklist 
provided by Hambleton and Eignor (1979) for 
rating criterion-referenced tests, it must 
be recognized that the weaknesses, as noted 
above, limit the quality of this Report as a 
Technical Manual for the Self-Assessment 
Examination.

IV. USING THE SELF-ASSESSMENT EXAMINATION TO DEVELOP
A PILOT CME SYSTEM FOR PAs 



A. Rationale 

The AAPA subscribes to a life-long system of continuing education
for its membership in order to assure the prestige of its profession
and to ensure its role in providing quality health care to
society.

The development of a national continuing education system for 
PAs entails several issues, of which the following appear to be the 
most critical to the PA profession at this time: 

1. What kinds of CME needs do PAs have? 

Is there some pattern to these needs in relation
to practice length (time elapsed since certification),
type of practice (especially supervising
physician specialty), geographic setting,
practice location?;

2. How are these CME needs related to performance
needs? Will the CME proposed result in the
desired quality of health care?;

3. Are PAs aware of their CME needs? What kinds
of CME do PAs normally seek? How much?;

4. How do PAs obtain their CME at this time? What
approaches seem popular, valued, disliked? What
kinds of CME programs are currently available to 
PAs? How good are they in terms of meeting the 
needs of the profession?; and 

5. How can a national CME system be developed so 
that a life-long (graduation gown to grave) 
competency assurance program is available and 
utilized by most PAs? 

Logistically, the identification of CME needs of the PA
profession can be effected by using self-assessment examination data.
However, this will require a major national effort because it goes way
beyond the usual professional survey in depth, although somewhat similar
in extent. It is imperative that the approach to PAs be made in
non-threatening terms and with sufficient utility offered to make their
participation possible and worthwhile. With participation time and
data-gathering costs becoming increasingly ominous, it is necessary to
come up with innovative approaches to data-gathering, which will fit 
into the professional style and schedules of PAs and yet satisfy the 
needs of statistical inference and generalization. 

Theoretically, a national profile of PAs must be valid. The
domain on which the profile is generated must be relevant and acceptable
to the profession. Given a new and developing profession this task is
not easy. Given the role of the supervising physician in the role of
all PAs, it becomes necessary to recognize this fact in the process of
domain definition, but without diminishing the stature of this group as
a profession. The changing pattern of health care services in this
country further complicates this task. With national health insurance
looming not too far off, the PA is bound to be called upon to modify
his role relative to this national health care need. As more of the
new type of allied medical professionals are ushered into the health care
system, role changes and new responsibilities will occur. 

Yet, within this dynamic system an assessment of the quality of 
the profession appears very much in order. Considerable public invest- 
ment has gone into the creation of this new professional. Expectations 
remain high and it is therefore legitimate to embark upon a reasonable 
effort to provide accountability data to the public. 

Professionally, there is nothing more challenging than the
opportunity to demonstrate that this young PA profession is conscious of its
responsibilities and willing to do whatever is necessary to maintain
quality in its ranks. The AAPA has already embarked upon several
continuing education programs and would welcome the opportunity to
organize the necessary national effort and recruit membership support. 

In generating new programs for the profession, however, the AAPA 
must face the fact that it is responsible to each individual member. 
The national interest in the quality of the profession must therefore 
be based upon the natural interest of each member to remain a worthy 
and useful member of the profession.

B. Methodology Proposed 

The next step in developing the CME system for PAs should take 
the form of a national pilot program founded upon the theoretical, 
logistic, and professional considerations discussed above.

Products that need to be developed in the theoretical domain 
were "brainstormed" by the Test Specifications Committee at both its
meetings. The major need is to relate professional responsibilities 
to professional performance. The task is to find the linkages between 
how PAs handle their clinical responsibilities and the quality of 
their knowledge, skills, and attitudes relative to these same
responsibilities. This calls for an in-depth analysis of clinical performance
along with an in-depth diagnosis of knowledge, skills, and attitudes.

Several strategies are available to the AAPA for implementing 
such an analysis of the causal dimensions of professional performance. 
Included would be the selection of a few representative PA clinical
training programs so that performance assessments can be made by clin- 
ical supervisors, and the skills, knowledge, and attitudes assessments 
can be handled by a revised national self-assessment examination. 

Needed for such an analysis are clinical assessment tools and
revised forms of the self-assessment examination. It would be
imperative to ensure that the domains assessed by the two types of
assessments are compatible, and that they in turn relate to the Role
Delineation for Physician Assistants. The adaptation of the Role
Delineation so that it might better fit such a CME model was already
begun in this current project. The eleven areas of responsibility
were reclassified into three comprehensive areas of competence:
professional, interpersonal, and clinical. Each competency area includes
three or more of the original eleven areas of responsibility. The
"clinical" area, however, can also be subdivided by Body Systems. The
17 Role Scales developed in the Self-Assessment Examination are based 
upon this adaptation of the Role Delineation. 

The development of this model is far from complete at this time. 
There is need for further classification so that the skills identified 
in the Role Delineation are better represented and assessed both in
the Self-Assessment Examination and in the set of clinical assessment 
tools that must be assembled. 

The implementation of such a project, from a logistics standpoint,
calls for a national effort with collaboration of selected PA training 
programs. It would not be difficult to gather data from PAs-in-training 
for such a project. However, it would be unwise to limit these "causal 
analyses" to such groups alone. The need to generalize such findings to
practicing professionals requires that the major study be conducted with 
practitioners. The training programs would only be involved in the 
"causal analysis" component of the project. 

Aside from generating the basic causal model, the project should 
aim at identifying the major CME needs of PAs and setting up a pilot 
model for implementing an appropriate CME program for PAs. This calls 
for three other components to the project: development of
self-assessment examinations, obtaining data for a national PA
competency profile, and development of CME learning packages.

The national PA profile would essentially amount to a CME needs
assessment. It would serve as the basis for emphasis in the
development of learning packages. The learning packages would be modularized
and geared to the self-assessment examination. The self-assessment
examination would also be modularized so as to make it convenient to
take, receive feedback, and follow-up by CME. Appropriate linkages
would need to be developed so that the causal model is operationalized
and thus CME is given the chance to result in better professional 
performance. 

The professional considerations in implementing such a project 
require that it receive the support of the professional organizations 
concerned and of their membership. Rather than initiate a massive new
effort with all the concomitant hazards and start-up costs, it would
be prudent to work into existing professional CME systems and available
CME mechanisms without getting overly bogged down in them. This
project would need their support but not necessarily their burdens. The
two, however, do not always exist separately. 

Opportunities, such as national meetings, currently available 
expertise, interests, and products, should be taken advantage of. It 
must particularly be recognized that other health professions may have
already dealt with some of these problems and that such know-how is 
transferable at lesser cost of time, money, and people. 

C. Time Schedule

It is estimated that the schedule for the implementation of
this proposed CME pilot system would require about three years with
achievements targeted approximately as follows:

Year 1    Develop all needed [illegible], including their try-out.
          Work with PA training programs to establish methodology.
          Develop test item bank for self-assessment exam.
          Survey CME approaches, methods, and offerings.

Year 2    Conduct try-out of the national profile of PAs.
          Develop strategies.
          Develop and try out learning packages.
          Develop feedback system.

Year 3    Complete national profile.
          Develop learning packages; modify packages.
          Use CME feedback system on pilot basis.
          Evaluate and recommend CME system.

V. FINDINGS AND CONCLUSIONS

The purpose of this component of Contract HRA 231-76-0053 was 
to develop a criterion-referenced self-assessment examination for
physician assistants, using the Role Delineation as the basis, so that
appropriate learning prescriptions could be developed in order to
facilitate the continuing medical education of physician assistants and
thereby ensure the quality of health care provided by them.

A 300-item comprehensive examination was proposed for
development using two Working Committees consisting of PAs and PA educators.
The Test Specifications Committee provided general guidelines for the
development of the examination and the Item Writers Committee did the
major work of writing and revising the test-items generated by the
project. The examination has undergone two try-outs and has been
revised each time, but additional revision is planned with the extensive
data now available.

The self-assessment examination is based upon the domain defined 
in the Role Delineation for PAs. Two sets of scales have been generated,
thereby allowing two approaches to the specification of the domain. One
set of scales, Role Scales, is based upon an adaptation of the eleven
major responsibilities of the PA. The other set of scales, Body System
Scales, is based upon the matrix comprising Body Systems and Medical
Intervention types. The 315 items that were administered as part of
Form A of the Self-Assessment Examination were assigned with the help
of expert judgment to each of the two sets of scales and scores were
generated. 

The interpretation of the scores obtained on the 17 Role Scales
and the 28 Body System Scales is done with the help of minimum competency
scores which were determined by expert judgments using the Nedelsky
Technique. It must be acknowledged at this point that this effort
needs additional data-gathering and development. 

Two somewhat innovative approaches were used in the implementation
of this project. One has to do with the process of item generation
where the critical incidents approach was used. The other relates to
the conceptual model for CME using self-assessment examinations. It is
the thesis of this model that CME must be based upon a combined analysis
of practice (P) requirements, individual felt needs (N), and deficits
identified by examination (E) scores. In accordance with this model a
four-page computer-generated reporting system was developed and returned
along with an Interpretive Leaflet as feedback to PAs who participated 
in the second try-out of the examination. 

The test-development process has been constrained by time but 
has nevertheless attempted to adhere to the professional standards pre- 
scribed by measurement specialists (APA-AERA-NCME Standards). The stan- 
dards identified for criterion-referenced test development by Hambleton 
and Eignor (1979) were also recognized in this project. Several limita- 
tions have been acknowledged relative to the "quality" of the test items 
in the current form of the examination. However, an overall assessment
of the examination must acknowledge not only its future potential but 
also several current good qualities. 

Although this Report must serve also as the Technical Manual for 
the Examination, limitations must be acknowledged in this respect. Not 
all of the research which should be contained in such a Manual has been
completed. Nor has there been sufficient time and opportunity to assimilate
all the data and analyses available to-date in order to provide a good
discussion of the tables currently included in the Exhibits. It is hoped
that the inquisitive reader will use the appended data and direct
comments and inquiries to the AAPA so that the needed technical reports
can be developed and added at a future date.

The Products generated by this project have been listed elsewhere; 
however, by way of summary, it needs to be noted that the AAPA now has an
Item Bank in the process of development, and a feedback system for
PAs taking the self-assessment examination which should serve as a first
step toward their continuing education. The suggestions given in this
Report for a Pilot CME System are based upon the experiences generated
in this project and upon a national perspective of the expected directions
and needed next steps for the physician assistant profession.

Finally, the reader might ask: How has this project been evaluated 
by PAs? Anonymous comments were received from PAs who participated 
in the First Try-Out. Most of these were favorable and indicated that the
membership was pleased with this undertaking. Formal data was gathered from
the 108 PAs who took the entire three-section self-assessment examination.
We were concerned because this was a randomly selected group who had not
volunteered for this imposition nor was it possible to assure them any CME
credit because approval had not yet been received for such credit at that
time. (The self-assessment examination has since been granted six hours 
of Category I credit, and this good news will be added to the feedback 
package that will be mailed to each PA in early September). Table 10 
presents the evaluative data that has been summarized from Items #59, 60, 
61, and 62 of Section 3 of the Self-Assessment Examination. 
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The following summary conclusions and recommendations can- be 
extracted from those data: 

     1.  The self-assessment examination required
         about 6 hours of work from the PAs who
         participated in the Second Try-Out;

     2.  The validity of the exam (using weights of
         1, 2, 3, 4, 5 for the 5 ratings and averaging
         them) was rated at between Good and
         Satisfactory (3.5);

     3.  Only 7 out of the 102 PAs who responded to
         this item responded with a "maybe" to the
         question whether AAPA should continue its
         efforts developing such self-assessment
         exams. About 75 percent said "Yes, very
         useful" and 20 percent said "Yes, somewhat
         useful". (It must be recognized that
         these ratings occurred without any feedback
         being received, and after the completion
         of a somewhat arduous task); and

     4.  About 78 percent requested as much feedback
         as the AAPA could afford to send them. This
         question also had two interesting response
         options indicating "specialty" comparisons
         versus all PAs' comparisons. It is noteworthy
         that 21 want additional comparisons
         relative to their own specialty whereas
         only two want comparisons with all PAs only.
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TABLE 10 
EVALUATIVE DATA

374. Approximately how many hours did this exam
     require of you?

     35   A.  Less than 6
     43   B.  About 6, but less than 7
     16   C.  About 7, but less than 8
      6   D.  About 8, but less than 9
      3   E.  About 9 hours, or more
      5       Blank

375. Recognizing that this Form A is still
     somewhat new, how valid (in terms of
     content, quality of questions, and
     type of exam) would you consider this
     exam to be as a self-assessment device
     for PAs?

      1   A.  Very poor
     15   B.  Weak
     40   C.  Satisfactory
     33   D.  Good
     14   E.  Very good
      5       Blank

376. Should the AAPA continue its efforts to
     improve this form and to develop new
     self-assessment exams like it in the
     future?

     76   A.  Yes, this would be very useful
     19   B.  Yes, this would be somewhat useful
      7   C.  Maybe, but I'm not too sure
      0   D.  No, it is not very useful
      0   E.  No, it is a real waste of time
      6       Blank

377. How much reporting of results (feedback)
     would you like to receive regarding your
     performance on this exam?

     79   A.  As much feedback as the AAPA can afford
              to send me
     21   B.  Information about my own performance
              with comparisons to all PAs, and
              especially PAs in my own specialty
      2   C.  Information about my own performance
              with comparisons to all PAs only
      1   D.  Just my own raw scores and subscores
      0   E.  I don't care to receive any feedback
      5       Blank
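
The weighted average quoted for the validity item can be recomputed from the response counts in Table 10. The sketch below is illustrative only: the function name `weighted_mean_rating` is an assumption, and blank responses are left out of the denominator, a choice the report does not state explicitly.

```python
# Response counts for the validity item in Table 10, keyed by the weights
# the report describes (1 = Very poor ... 5 = Very good).
# The 5 blank responses are excluded (an assumption).
validity_counts = {1: 1, 2: 15, 3: 40, 4: 33, 5: 14}


def weighted_mean_rating(counts):
    """Average rating, weighting each rating value by its response count."""
    total_responses = sum(counts.values())
    weighted_sum = sum(weight * n for weight, n in counts.items())
    return weighted_sum / total_responses


mean_rating = weighted_mean_rating(validity_counts)
print(f"{sum(validity_counts.values())} responses, mean rating {mean_rating:.2f}")
```

With these counts the average falls between Satisfactory (3) and Good (4), which is where the summary above places it.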




REFERENCES 



Bloom, B.S., ed. Taxonomy of Educational Objectives: Cognitive
Domain. New York: David McKay, 1956.

Campbell, David. Strong-Campbell Interest Inventory. Minneapolis,
Minn.: University of Minnesota, 1972.

Cohen, J. "A Coefficient of Agreement for Nominal Scales." Educational
and Psychological Measurement, 1960, 20, 37-46.

D'Costa, Ayres et al. Ohio Vocational Interest Survey. New York:
The Psychological Corporation, 1970.

D'Costa, Ayres. The Longitudinal Study of Physicians. Technical
Proposal funded by NCHSR, Association of American Medical Colleges,

Eignor, D.R. and Hambleton, R.K. "Effects of test length and advancement
score on several criterion-referenced test reliability and
validity indices." Laboratory of Psychometric and Evaluative
Research Report No. 86. Amherst, Mass.: School of Education,
University of Massachusetts, 1979.

Engel, J.D. "A Comparison of Diagnostic and Certifying Examinations."
American Journal of Medical Technology, 43, 5, Dec. 1976, 436-439.

Flanagan, John C. "The Critical Incident Technique." Psychological
Bulletin, 51, 4, July 1954, 327-359.

Gagne, R.M. The Conditions of Learning. New York: Holt, Rinehart, 1970.

Hambleton, R.K. and Eignor, D.R. "Competency Test Development, Validation,
and Standard Setting." In R. Jager and C. Tittle, eds., Minimum
Competency Testing. Berkeley, Calif.: McCutchan, 1979.

Hambleton, R.K. and Eignor, D.R. Criterion-Referenced Test Development
and Validation Methods. AERA Training Program Materials, 1979.

Hess, K.M. and Morreau, L.E. "The Expanding Classroom...Why
Self-Assessment?" Postgraduate Medicine, 59, 1, Jan. 1976, 203-210.

Krathwohl, D.R., Bloom, B.S. and Masia, B.B. Taxonomy of Educational
Objectives: Handbook II: Affective Domain. New York: David McKay,
1964.

Marsland, Wood, M., and May, F. "Content of Family Practice."
Journal of Family Practice, 3, 1976, 37-45.

Milgrom, P., Weinstein, P., Ratener, P., and Morrison, K. "Dentists' Self-
Evaluations: Relationships to Clinical Performance." Journal of
Dental Education, 42, 4, 1978, 180-185.






Millman, J. "Criterion-Referenced Measurement." In W.J. Popham, ed.,
Evaluation in Education: Current Applications. Berkeley, Calif.:
McCutchan, 1974.

Nedelsky, L. "Absolute Grading Standards for Objective Tests."
Educational and Psychological Measurement, 14, 3-19, Spring 1954.

Perry, Henry B. "An Analysis of the Professional Performance of
Physician's Assistants." Journal of Medical Education, 52, Aug. 1977,
639-647.

Popham, W.J. Criterion-Referenced Measurement. Englewood Cliffs, N.J.:
Prentice-Hall, 1978.

Pottinger, P.S. Competence Testing as a Basis for Licensing: Problems
and Projects. Paper presented at Conference on Credentialism,
Berkeley, Calif., April 1977.

Subkoviak, M. "Estimating reliability from a single administration of a
criterion-referenced test." Journal of Educational Measurement,
1976, 13, 265-275.

Wilson, Margaret A. "Basic Principles of Credentialing Health
Practitioner." Respiratory Care, 21, 10, Oct. 1976, 954.



EXHIBIT A 



Unprioritized List of Program 
and Research Objectives 






Program Objectives 

1. Develop appropriate baseline for
recertification exam.

2. Profile PAs nationally/CME needs
assessment.

3. Provide individualized feedback. 

4. Evaluate PA training programs. 

5. Exam matches clinical practice. 

6. Learning packages to improve 
competence. 

7. Use log diary to validate test. 

8. Individuals' learning styles to 
provide CME. 

9. Competence improvements as result 
of learning packets. 

10. Collect critical incidents and 
skills and relate to curriculum. 

11. Sensitivities of PA profession
to test items.

12. Item analysis. 

13. Key word analysis. 

14. Two equivalent forms of the 
test. 



Research Objectives 

1. Compare option formats. 

2. Compare item styles. 

3. Strengths/weaknesses by geographical 
location, training program, specialty. 

4. Self-expressed competence and 
successful recertification. 

5. Test specifications matrix. 

6. Pragmatic research model to research 
profession. 

7. Causal dimensions of competence. 

8. Relationship between competence areas 
and tasks frequently done. 

9. Develop item pool.

10. Identify core performances. 

11. Longitudinal changes in PA 
population on all parameters. 

12. Profile of test components (content). 

13. Referenced feedback.



EXHIBIT B

The Eleven Areas of the PA Role Delineation



THE ELEVEN MAJOR RESPONSIBILITIES OF THE PA ROLE DELINEATION



I. RECOGNIZE INTERDEPENDENT RELATIONSHIP WITH SUPERVISING PHYSICIAN

     A. Accept that the role of physician assistant is limited

     B. Resist compromises in the practice of medicine when
        conflicting with professional ethics

     C. Express professional opinion on matters of patient care,
        even if different from supervising physician's opinion

     D. Express limitations of the role when necessary

II. DEMONSTRATE PROFESSIONAL BEHAVIOR

     A. Possess attributes of empathy, objectivity, tolerance,
        confidence

     B. Demonstrate professional attributes in actions

III. PROMOTE PREVENTIVE HEALTH CARE

     A. Educate patient and family concerning health care
        measures

     B. Perform screening examinations

     C. Provide sex education

     D. Provide counseling to patient and family

     E. Provide resources for patient education

IV. ESTABLISH HEALTH STATUS DATA BASE

     A. Modify data gathering process as necessary

     B. Elicit pertinent medical and psycho-social history

     C. Perform physical examination as pertinent

     D. Establish preliminary diagnosis of common problems

     E. Obtain information from screening and diagnostic tests
        by ordering and performing tests and obtaining specimens

     F. Record and transmit findings from history and physical
        examination

     G. Inform physician of tentative problem list

V. ANALYZE DATA BASE

     A. Differentiate between normal and abnormal (including
        variations of normal) information contained in the
        data base

     B. Interpret raw data from screening and diagnostic tests

     C. Interpret written report of screening and diagnostic tests

     D. Validate preliminary diagnosis of common problems

     E. Develop diagnostic impressions from information contained
        in the data base

     F. Establish working diagnosis of common problems

     G. Confer with supervising physician according to practice's
        guidelines

VI. FORMULATE HEALTH MANAGEMENT PLAN

     A. Resolve deficiencies defined by data base

     B. Prioritize problems to be managed

     C. Devise plan to coordinate multiple treatment modalities

     D. Select therapeutic measures

     E. Select supportive services to be involved in patient
        care

     F. Describe parameters of patient education relating to
        immediate problems, then others

     G. Formulate a management plan for common problems

VII. IMPLEMENT HEALTH MANAGEMENT PLAN

     A. Educate patients and family

     B. Contact supportive services to be involved in patient
        care

     C. Provide information pertinent to consultation/referral

     D. Provide treatment of common problems

     E. Refer patients as necessary for treatment of common problems

     F. Initiate medical therapies/procedures

VIII. MONITOR HEALTH MANAGEMENT PLAN

     A. Assess degrees of patient compliance

     B. Assess progress toward desired result

     C. Determine economic impact of management plan

     D. Determine impact of community resources

     E. Recognize undesirable effects of treatment plan

     F. Redirect patient efforts based upon results of treatment
        plan

IX. ESTABLISH EFFECTIVE INTERPERSONAL RELATIONSHIPS WITH PATIENTS,
    PROFESSIONALS, AND OTHERS

     A. Adapt suitable interviewing style

     B. Accept personal, cultural, and professional factors
        affecting health

     C. Assist patient/family in handling/expressing feelings

     D. Recognize changes in patient's psychological state

     E. Maintain relationship with referred patients

     F. Demonstrate concern for patient's privacy, modesty,
        anxieties during the examination

     G. Transmit and record information
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X. MAINTAIN COMPETENCY 

     A. Engage in periodic review of professional skills (self-
        assessment, etc.)

     B. Devise and maintain program of formal and informal CME based
        upon recognized needs

     C. Acquire knowledge and skills essential to incorporating into
        practice proven new evaluation/treatment modalities

     D. Maintain an on-going library of appropriate journals and books

     E. Maintain membership in professional organizations

     F. Obtain/maintain certification as a PA

     G. Critically review the current literature

XI. PROMOTE ACCEPTANCE OF THE ROLE

     A. Explain role by actions and words to others

     B. Display sensitivity to the partial overlapping and possible
        sharing of responsibilities with other health professionals

     C. Use formal and informal conflict resolution techniques
        including adjusting activities, fostering improved working
        relationships, helping behavior

     D. Transmit reference materials to relevant professionals
        concerning physician assistant functions and utilization

     E. Assess within the work group the behavior of individuals and
        group actions to facilitate problem solving or prevent problems
        from arising

     F. Know and implement strategies useful in gaining acceptance of
        the role within the community

     G. Give talks to groups interested in the PA concept

     H. Seek out or counsel prospective PA students

     I. Write articles for local newspaper about the PA concept

     J. Submit an article for publication

     K. Initiate contact with other physicians in the area to promote
        the PA concept

     L. Participate in community health programs

     M. Initiate change in routine protocol



EXHIBIT C
Frequency Distribution of Total Scores

Sample size = 108
Mean score = 188
Minimum competency score = 171



EXHIBIT D

Means, Standard Deviations, and
Minimum Competency Score for the 17 Role Scales

Columns: Role Scale, Items, MCS, Mean
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EXHIBIT E 



Means, Standard Deviations, and
Minimum Competency Score for the 28 System Scales
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EXHIBIT F 

Distributions of Item Difficulty and Discrimination
Indices for Total Test (Second Try-Out)



Item Difficulty Distribution (Second Try-Out)

     Range            Number of Items     Percent

     0.81 to 1.00            34              11
     0.61 to 0.80            55              17
     0.41 to 0.60            66              21
     0.21 to 0.40            86              27
     0.00 to 0.20            74              23



Item Discrimination Distribution (Second Try-Out)

     Range            Number of Items     Percent

     0.81 to 1.00             0               0
     0.61 to 0.80             1               0
     0.41 to 0.60            29               9
     0.21 to 0.40           103              33
     0.00 to 0.20           172              55
     Below 0.00              10               3
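
The Percent columns above follow directly from the item counts. As a minimal sketch, assuming the percentages are simply each band's count over the 315 try-out items rounded to a whole number, they can be recomputed as follows. The dictionary keys (the upper bound of each band, with 0.00 standing for "Below 0.00") and the helper name `percent_distribution` are illustrative choices, not part of the report.

```python
# Item counts per band from Exhibit F, keyed by the band's upper bound.
difficulty_counts = {1.00: 34, 0.80: 55, 0.60: 66, 0.40: 86, 0.20: 74}
# For discrimination, the key 0.00 stands for the "Below 0.00" band.
discrimination_counts = {1.00: 0, 0.80: 1, 0.60: 29, 0.40: 103, 0.20: 172, 0.00: 10}


def percent_distribution(counts):
    """Share of items in each band, as a whole-number percentage."""
    total = sum(counts.values())
    return {band: round(100 * n / total) for band, n in counts.items()}


difficulty_pct = percent_distribution(difficulty_counts)
discrimination_pct = percent_distribution(discrimination_counts)

for band, pct in difficulty_pct.items():
    print(f"difficulty up to {band:.2f}: {pct} percent")
for band, pct in discrimination_pct.items():
    print(f"discrimination up to {band:.2f}: {pct} percent")
```

Because each figure is rounded separately, the whole-number percentages need not add up to exactly 100.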
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EXHIBIT G 



Reliability, Coefficient of Agreement, and Standard Error
of Measurement for Total Test and for the 17 Role Scales

Total Test:  Reliability 0.90;  Standard Error of Measurement 7.41;
             Coefficient of Agreement* 1.00

*Note: The coefficient of Acceptance is the probability
of the consistency in classifying an individual as competent
or incompetent on the basis of this test score. It
is dependent on the minimum competency score and is usually
inflated by consistencies that occur by chance.
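
The note's point about chance can be illustrated with a small calculation. The sketch below uses made-up classifications for ten examinees on two hypothetical administrations; it computes the raw proportion of consistent decisions and a chance-corrected index (Cohen's kappa, named here as one common correction, not necessarily the statistic used in this report). The function name and data are assumptions for illustration only.

```python
def classification_agreement(first, second):
    """Return (observed agreement, chance agreement, kappa) for two
    lists of competent (1) / not competent (0) decisions."""
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    p_first = sum(first) / n    # proportion classified competent, occasion 1
    p_second = sum(second) / n  # proportion classified competent, occasion 2
    # Agreement expected if the two sets of decisions were independent.
    chance = p_first * p_second + (1 - p_first) * (1 - p_second)
    if chance == 1.0:
        return observed, chance, float("nan")  # kappa undefined
    kappa = (observed - chance) / (1 - chance)
    return observed, chance, kappa


# Hypothetical decisions for ten examinees (1 = at or above the cutoff).
first_try = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
second_try = [1, 1, 1, 1, 1, 1, 1, 0, 1, 0]

observed, chance, kappa = classification_agreement(first_try, second_try)
print(f"observed {observed:.2f}, chance {chance:.2f}, kappa {kappa:.3f}")
```

In this made-up case most of the raw agreement is what chance alone would produce, which is the sense in which an uncorrected coefficient is "inflated".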
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Reliability, Coefficient of Agreement, 
and Standard Error of Measurement of the 28 System Scales



EXHIBIT H

RELIABILITY, COEFFICIENT OF ACCEPTANCE AND STANDARD
ERROR OF MEASUREMENT FOR THE 28 SYSTEM SCALES

*Note: The coefficient of Acceptance is the probability
of the consistency in classifying an individual as competent
or incompetent on the basis of this test score. It
is dependent on the minimum competency score and is usually
inflated by consistencies that occur by chance.
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EXHIBIT I 



Assignment of Second Try-Out Test Items
to Original Test Specifications Matrix



ASSIGNMENT OF SECOND TRY-OUT TEST ITEMS TO TEST SPECIFICATIONS MATRIX

Role Area          1 Knowledge     2 Problem Solving     3 Interpersonal     Total


1 


Physician lo^terdep. 


1 (3) 


i 




1 ( 1 ) 
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4 
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46 (19) 


29 


(20) 


( 6 ) 


75 


(45) 


5 


Analyze Data 


37 (20) 


68 


(20) 


1(6) 
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(46) 
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Form Health Plan 


21 (20) 
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3 (6) 


59 
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(4) 


1 


(4) 


1 ( 1 ) 


2 


(9) 






Total              130 (129)       169 (132)             16 (33)             315 (300)

Note: Numbers within brackets indicate weights recommended
by Test Specifications Committee.
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Individual Report 
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/ EXHIBIT J 



AMERICAN ACADEMY OF PHYSICIAN ASSISTANTS

SELF ASSESSMENT EXAMINATION FORM A

THIS REPORT IS PREPARED FOR AAPA ID #

WHO COMPLETED THIS EXAMINATION ON 7/13/79.



YOUR TOTAL RAW SCORE IS 165 OUT OF 308; YOU OMITTED 2 ITEMS.
MINIMUM COMPETENCY TOTAL SCORE IS 171.2.

MEAN SCORE OF NATIONAL SAMPLE OF PAS IS 187.8, WITH STD DEV= 23.7
PRINTED BELOW IS YOUR RESPONSE TO EACH ITEM.

THE CORRECT RESPONSE IS SHOWN WITHIN BRACKETS ONLY WHEN DIFFERENT.
ASTERISK INDICATES ITEM IS NOW DELETED FROM EXAM.



PAGE 2 OF REPORT


FOR AAPA ID NUMBER

BROKEN DOWN BY ROLE AREA AND BODY SYSTEM,
YOUR EXAMINATION SCORES WOULD APPEAR AS FOLLOWS:

     SCALE                     TOTAL ITEMS   YOUR SCORE   MIN COMP

  1  PROFESSIONAL ROLE              10            6          5.3
  2  INTERPERSONAL BEHAV.           16           10          7.7
  3  GATHER DATA--RESP,CV           22            8         12.5
  4  GATHER DATA--GI,GU             17           10          9.8
  5  GATHER DATA--PSY,NEU           15           10          8.9
  6  ANALYZE DATA--ENDO              7            5          3.8
  7  ANALYZE DATA--RESP,CV          28            9         13.7
  8  ANALYZE DATA--HEMA              5            2          2.4
  9  ANALYZE DATA--GI,GU            31           17         17.1
 10  ANALYZE DATA--PSY,NEU          12                       6.3
 11  FORM PLAN--MUSC-SKEL            5            3          3.5
 12  HANDLE DERM PROBLEM            13            6          6.5
 13  MANAGE PTS--RESP,CV            31           18         18.4
 14  MANAGE PTS--GI,GU              24           13         12.9
 15  MANAGE PTS--REPRODUC            9            7          5.1
 16  MANAGE PTS--PSY,NEUR            6            5          3.2
 17  HANDLE PHARM PROBLEM            7            4          4.2



PAGE 3 OF REPORT FOR AAPA ID NUMBER



BROKEN DOWN BY BODY SYSTEM AND MEDICAL INTERVENTION,
YOUR EXAMINATION SCORES WOULD APPEAR AS FOLLOWS:

     SCALE                   TOTAL ITEMS   YOUR SCORE   MIN COMP

     EMERGENCY
  1  CARDIO-VASCULAR               8            4          4.5
  2  RESPIRATORY                   7            4          4.1
  3  GASTRO-INTESTINAL            10            6          6.5
  4  NEUROLOGY                     6            5          4.2

     ACUTE
  5  CARDIO-VASCULAR              15            6          8.5
  6  MUSCULO-SKELETAL              9            4          5.2
  7  RESPIRATORY                  16            7          8.7
  8  GASTRO-INTESTINAL            23           15         13.0
  9  GENITO-URINARY               18           11          8.4
 10  NEUROLOGY                     7            3          3.8
 11  PHARMACOLOGY                 12            8          7.8
 12  DERMATOLOGY                   9            5          4.7
 13  EYES & ENT                   12            5          6.3
 14  HEMATOLOGY                    8            3          4.5
 15  REPRODUCTIVE                 12            7          6.6
 16  PSYCHO-SOCIAL                11            7          5.2

     CHRONIC
 17  CARDIO-VASCULAR                            4          5.1
 18  MUSCULO-SKELETAL              6            4          2.8
 19  RESPIRATORY                   5            3          2.7
 20  GASTRO-INTESTINAL            13            4          7.2
 21  GENITO-URINARY                8            5          4.2
 22  NEUROLOGY                     6            4          4.0
 23  ENDOCRINE                    16            8          8.6
 24  DERMATOLOGY                   3            1          1.3
 25  EYES & ENT                    7            2          3.1
 26  HEMATOLOGY                    5            2          2.9
 27  REPRODUCTIVE                  5            4          3.0
 28  PSYCHO-SOCIAL                 8            3          3.4



CONTINUED



PAGE 4 OF REPORT FOR AAPA ID NUMBER

COMPARED WITH YOUR PRACTICE (P) PROFILE (TYPES OF PATIENTS SEEN)
AND WITH YOUR CME NEEDS (N) PROFILE (AREAS SOUGHT TO LEARN),
YOUR EXAM (E) SCORES WOULD SHOW UP AS FOLLOWS:

(Bar chart: for each of the 28 scales, grouped under EMERGENCY (1-4),
ACUTE (5-16), and CHRONIC (17-28), three horizontal bars are printed,
labeled P, N, and E.)
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EXHIBIT K 



Competency-Based Self-Assessment for Physician Assistants

developed by the 
American Academy of Physician Assistants (AAPA) 
Form A, 1979 



HOW TO INTERPRET YOUR REPORT 
(an interpretive leaflet for the physician assistant) 



Introduction

The self-assessment examination which you recently completed and
the enclosed computer-generated report were developed by the AAPA under
Contract No. HRA 231-76-0053 with the guidance of Ayres D'Costa, Ph.D.,
Associate Professor, Health Professions Education, The Ohio State
University, Columbus, Ohio. Major support for the development of the
exam was provided by an Item Writers Committee and a Test Specifications
Committee whose members are PA practitioners and PA educators.

This Interpretive Leaflet is designed to assist you in deriving
some benefits from the exam. Comparisons between your scores as an
individual and the scores of PAs as a group are not emphasized, because
we want the self-assessment exam to be criterion-referenced. Our
criterion is PA competence, and we would like to help you work towards this
goal.

In order to assure that the exam does cover the areas of competence
expected of PAs, we utilized the Role Delineation for the Physician
Assistant, recently developed by the AAPA. Eleven major responsibilities,
or role areas, span the realm of tasks which a PA should be competent to
perform. The items included in the self-assessment exam were considered
to be the best items available to AAPA for representing the critical PA
behaviors relative to competent performance in the 11 role areas. (For
a complete discussion of the item generation and test development process,
see AAPA's Final Report to HRA on this project, Volume III: Development
of a Self-Assessment Examination for Physician Assistants.)

Certain limitations of the exam must be pointed out in interpreting
your exam results. The exam was under development when you completed it
and is, therefore, subject to further review and refinement. Moreover,
we have not yet analyzed the extent to which the sample who actually
completed the exam is representative of the PA population. As the
self-assessment program gains in experience, the quality and quantity of the
test items available to us will improve, and the results you will receive
in the future will carry greater credibility.



Technical Information 

The technical information included in this leaflet is not entirely
necessary in order to interpret your exam results. Such information is
provided within brackets [ ] for those readers who might be interested
in it.

[The reliability of the total test is 0.90, with a standard error
of 7.41. These figures are generally considered respectable by measurement
professionals. However, the attached computer-generated report
also includes scores on scales (described below) derived from subsets
of items from the total test. The reliabilities of the scales can therefore
be expected to be lower than the reliability of the total test. The
reliabilities of these scales are presented in Tables 1 and 2 so that you
can use the necessary caution in using this data in your self-assessment.
In a criterion-referenced exam the coefficient of agreement serves as the
more appropriate index of reliability. See Tables 1 and 2.]
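
For readers who want to see how a reliability figure and a standard error relate, the sketch below applies the usual classical test theory relation, SEM = SD x sqrt(1 - reliability). This is an assumption about method: the leaflet does not say how its 7.41 was computed. The standard deviation of 23.7 is the value printed on the sample report, and the function name is illustrative.

```python
import math


def standard_error_of_measurement(std_dev, reliability):
    """Classical test theory estimate: SD times sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return std_dev * math.sqrt(1.0 - reliability)


# Total-test figures as printed (both are rounded values).
sem = standard_error_of_measurement(std_dev=23.7, reliability=0.90)
print(f"estimated standard error of measurement: {sem:.2f}")
```

With these rounded inputs the estimate comes out near 7.5, close to the 7.41 quoted above; the small gap is what one would expect if the published figure was computed from unrounded values.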



How to Interpret the Report 

The computer-generated report consists of four pages. Among the
information reported are: your response and the correct response to each item,
your scores on scales derived from subsets of items, and a graphic
comparison of your practice characteristics, your CME needs, and your exam scores.
Detailed instructions on how to interpret each page of the report are given
below.

How to Interpret Page 1. Your Total Raw Score is the total number of
test questions you answered correctly. You may recall that there were 315
questions on the test. However, as a result of item analyses and further
review, seven items were discarded from the self-assessment exam and were
not scored or included in any examination statistics. Therefore, the
highest score one could attain is 308.

In order to judge the adequacy of your total raw score, you should
compare it to the Minimum Competency Score. This score is based upon
the expert judgements of some of the PAs and PA educators who developed
the exam. The word "minimum" is intended to convey a basic level of
competence for an entry level generalist PA. Following the minimum
competency score is the number of items you Omitted on your answer sheet.

The Mean Score is commonly called the arithmetic average; it indicates
how the national sample of PAs scored on the exam. You should use the
standard deviation (Std Dev) along with the mean to appreciate how far the
scores of the entire sample are spread out from the mean score (i.e., are
distributed).
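
As a worked illustration of these comparisons, the sketch below takes the figures printed on the sample report in Exhibit J (raw score 165, minimum competency score 171.2, mean 187.8, standard deviation 23.7) and expresses the raw score both as a margin from the minimum competency score and as a distance from the mean in standard deviation units. The function name `standard_score` is an illustrative assumption.

```python
def standard_score(raw, mean, std_dev):
    """Distance of a raw score from the mean, in standard deviation units."""
    return (raw - mean) / std_dev


# Figures printed on page 1 of the sample report.
raw_score = 165
minimum_competency = 171.2
sample_mean = 187.8
sample_std_dev = 23.7

margin = raw_score - minimum_competency
z = standard_score(raw_score, sample_mean, sample_std_dev)

print(f"margin from minimum competency score: {margin:+.1f} points")
print(f"distance from the sample mean: {z:+.2f} standard deviations")
```

Since the exam is meant to be criterion-referenced, the margin from the minimum competency score is the primary comparison; the distance from the mean is secondary context.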

The extensive printouts under Section 1 and Section 2 refer to the
test items. Your response is printed after each item number. A letter
in parentheses is the correct response and appears only if you answered
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Table 1 



RELIABILITY, COEFFICIENT OF ACCEPTANCE AND STANDARD ERROR
OF MEASUREMENT FOR THE 17 ROLE SCALES AND THE TOTAL TEST

Scale     #Items     Reliability     SEM      Coef. of A*

  1         10                       1.43       0.58
  2         16          0.48                    0.97
  3         22                                  0.63
  4         17          0.01         1.51       0.75
  5         15                       1.64       0.91
  6          7                       1.11       0.84
  7         23                       4.31       0.80
  8          5          0.23         0.43       0.52
  9         31          0.32         2.27       0.63
 10         12          0.33         1.51       0.58
 11          5          0.31         0.80       0.32
 12         13          0.30                    0.89
 13         31          0.26         2.27       0.78
 14         24          0.19         2.11
 15                     0.37         1.22       0.68
 16          6          0.23         1.01       0.84
 17          7                                  0.58

Total
Test       308          0.90         7.41       1.00

*Note: The coefficient of Acceptance is the probability
of the consistency in classifying an individual as
competent or incompetent on the basis of this test
score. It is dependent on the minimum competency
score and is usually inflated by consistencies that
occur by chance.
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Table 2 

RELIABILITY, COEFFICIENT OF ACCEPTANCE AND STANDARD
ERROR OF MEASUREMENT FOR THE 28 SYSTEM SCALES

Scale     #Items     Reliability     SEM      Coef. of A

  1          6          0.30         1.19       0.55
  2          7          0.31         1.01       0.53
  3         10          0.22         1.28       0.76
  4          6          0.25         1.00       0.56
  5         15          0.33         1.70       0.86
  6          9          0.24         1.31       0.53
  7         10          0.49         1.74       0.68
  8         23          0.21         1.93       1.00
  9         18                       1.70       0.86
 10          7          0.23         1.23       0.58
 11         12          0.32                    0.82
 12          9                                  0.53
 13         12          0.37         1.45       0.55
 14          8                       1.17       0.74
 15         12                       1.40       0.66
 16         11          0.26         1.42       0.57
 17          9          0.25         1.24       0.55
 18          6          0.48         1.01       0.90
 19          6          0.08         0.90       0.91
 20         13          0.19         1.59       0.89
 21          3                                  0.50
 22          6          0.39                    0.65
 23         16          0.45         1.64       0.73
 24          3         -0.01         0.74       1.00
 25          7          0.05         1.19       0.72
 26          5          0.39         1.00       0.71
 27          5          0.52         0.86       0.78
 28          8          0.36         1.17       0.62

the item incorrectly. The items you did not answer are left blank. The
items that have been discarded from the exam are marked with an asterisk.

Note the items that you answered incorrectly. You might wish to go
back to the test questions. (If you no longer have your test booklets,
write the AAPA for another set.) Do you agree with the correct answer?
Maybe you don't for a very good reason. We would like to hear your
reasons and your comments. Remember that our item writers and committee
members are also PAs like you. Maybe we goofed! Perhaps you agree with
our experts and would like to undertake some self-learning. We have not
been able to develop such learning packages yet, but would like to.
Hearing of your interest will help us. Do write and let us know your
specific needs.

How to Interpret Page 2. The 17 scales on this page were derived
from the Role Delineation for the Physician Assistant. There are 11 role
areas and several of the areas are subdivided by 13 body systems. At
this point in the development of items for a self-assessment exam, not
enough items are available to form a scale for each of the role areas and
body systems. Therefore, some of the scales on this page represent more
general areas of competence derived by a meaningful combination of the 11
role areas. For some role areas that are well represented on the
self-assessment exam, it was possible to derive more than one scale so as to
represent competence within various body systems. At this stage, there
is a limited number of role area-by-body system scales available.
The 11 role areas, the 13 body systems, and a graphic representation of
these 17 scales are presented in Table 3. You will note that one of the
11 role areas on the role delineation -- promote preventive health care --
does not have any scale on the exam, as not enough test items have been
generated for this area.

Scale 1, Professional Role, combines three role areas across all body 
systems, namely, recognize interdependent relationship with supervising 
physician, maintain competency, and promote acceptance of the role. 

Scale 2, Interpersonal Behavior, combines two role areas across all 
body systems, namely, demonstrate professional behavior and establish 
effective interpersonal relationships with patients, professionals, and 
others. 

Scales 3, 4, and 5, Gather Data, relate to competence in one role 
area (establish health status data base) separately for three body systems. 

Scales 6 through 10, Analyze Data, relate to competence in one role 
area (analyze the health status data base) for various body systems. 

Scale 11 relates to one role area (formulate health management plan) 
for one body system (musculoskeletal).

Scale 12 combines five role areas (establish and analyze data base 
and formulate, implement, and monitor health management plan) relevant to 
dermatology problems. 
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Table 3 

MAP OF THE 17 ROLE AREAS X BODY SYSTEM SCALES

(Chart: PA role areas, listed down the side, mapped against body systems,
numbered across the top. Legible row labels include Professional Role
(role areas I, X, and XI), Professional Behavior (II), and Gather Data.)

Table 4

MAP OF THE 28 BODY SYSTEMS x MEDICAL INTERVENTION SCALES

[Graphic not reproduced. Only the body system labels survive.]

1. Musculo-Skeletal
2. Dermatology
3. Endocrine
4. Eyes & ENT
5. Respiratory
6. Cardio-Vascular
7. Hematology
8. Gastro-Intestinal
9. Genito-Urinary
10. Reproductive
11. Neurology
12. Psycho-Social
13. Other (Pharmacy)

Scales 13 through 16 (Manage Patients) combine three role areas
(formulate, implement, and monitor health management plan) for various
body systems.

Scale 17 combines the five role areas of Scale 12 relative to 
pharmacological problems. 

Three scores are provided for each of these 17 scales. The first
indicates the total number of items included in the scale and is the
maximum possible score on the scale. The second is Your Score and it
indicates the number of these items that you answered correctly. The
third is the Minimum Competency score, which is a tentative benchmark for
comparison with your scale score. [Do not forget possible problems of
reliability of scales with a small number of items. The reliability
index and the standard error for each scale are listed in Table 1. A
reliability of 0.70 is the minimum we would like to see. However,
several factors, such as number of items and group diversity, affect
the reliability index. For this reason, it is difficult to provide a
general rule for interpreting this index.]
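
As an illustration of the caution above, the following Python sketch compares a scale score with its Minimum Competency benchmark while allowing for the standard error of the scale. The function name, the example values, and the one-standard-error band are assumptions made for illustration only; they are not taken from the Report or from Table 1.

```python
def compare_to_benchmark(your_score, minimum_competency, standard_error, total_items):
    """Classify a scale score relative to the Minimum Competency benchmark.

    A score within one standard error of the benchmark is reported as
    "inconclusive". This band is an illustrative convention, not an AAPA rule.
    """
    if not 0 <= your_score <= total_items:
        raise ValueError("score must lie between 0 and the number of items")
    difference = your_score - minimum_competency
    if abs(difference) <= standard_error:
        return "inconclusive"
    return "above benchmark" if difference > 0 else "below benchmark"


# Hypothetical values for one scale: 20 items, benchmark 12, standard error 2.0
print(compare_to_benchmark(your_score=13, minimum_competency=12,
                           standard_error=2.0, total_items=20))
```

With these hypothetical values the score lies within one standard error of the benchmark, so the comparison is treated as inconclusive rather than as evidence of competence or deficiency.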

How to Interpret Page 3. The 28 scales on this page were derived
from available items representing body systems as well as type of medical
intervention. These scales provide another way of interpreting your exam
results. The 13 categories of body system, the four types of intervention,
and a graphic representation of the 28 scales are presented in
Table 4. You will see from the Table that "Acute" and "Maintenance" are
combined, as there are not yet sufficient exam items to represent both
types of intervention. Also, not all body systems are represented relative
to "Emergency" interventions. Both "Acute" and "Chronic" appear to
be well represented. Scale 11 does not represent a body system; however,
sufficient exam items were found relevant to drug information (dosage,
side effects, etc.) to include a scale for "Acute Pharmacology."

The format of information on page 3 is identical to that of page 2. 
Again three scores are provided. Similar cautions are urged in utilizing 
this information in self-assessment. The reliability index and standard 
error for each scale are listed in Table 2. 

How to Interpret Page 4. We expect that you will find this page 
very interesting and useful to you in your self-assessment. Instead of 
actual raw scores, page 4 presents a graphic representation of your scores 
on the 28 scales from page 3. Page 4 is arranged to allow you to make 
comparisons among your "practice characteristics (P)," your "CME needs (N)," 
and your "exam scores (E)." The P and N representations were derived from 
your responses to Section 3 of the exam, where questions were of the type 
"How many patients do you see for certain types of care?" (for practice 
characteristics) and "How comfortable do you feel about your performance 
with patients requiring certain types of care?" (for CME needs). 

We believe that page 4 will be helpful to you in planning your CME 
activities. You can make two kinds of visual comparisons on page 4. You 
can compare your practice, your needs, and your exam score on the same 
scale. Also, you can compare your practice across all the scales to find
out which types of patient problems you see relatively more and less often
in your practice. Similarly, you can compare your CME needs and exam
scores across all scales.

[Your scale scores on page 3 and their graphic representations on
page 4 cannot be compared directly, as the latter represent standard scores.
Likewise, the practice characteristics scores and the CME needs scores are
standard scores. All of these standard scores have a mean of 5.0, a standard
deviation of 0.5, and a range of 1 to 9. The standard scores are based upon the
national sample of PAs who completed the exam in June 1979.]
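
The leaflet does not state how raw scores were converted to standard scores. The Python sketch below shows one common approach, a linear rescaling against the national sample followed by clipping to the stated range. The linear form, the function name, and the sample mean and standard deviation in the example are assumptions, not the procedure actually used for the Report.

```python
def to_standard_score(raw, sample_mean, sample_sd,
                      target_mean=5.0, target_sd=0.5, low=1.0, high=9.0):
    """Rescale a raw score to the standard-score metric described above.

    Assumes a linear transformation (not confirmed by the leaflet):
    the raw score is expressed in sample standard deviations, mapped to
    the target mean and standard deviation, then clipped to [low, high].
    """
    if sample_sd <= 0:
        raise ValueError("sample standard deviation must be positive")
    z = (raw - sample_mean) / sample_sd
    scaled = target_mean + target_sd * z
    return max(low, min(high, scaled))


# Hypothetical national sample for one scale: mean 12 items correct, SD 4
print(to_standard_score(raw=14, sample_mean=12, sample_sd=4))  # 5.25
```

Because every scale is placed on the same metric, standard scores for practice characteristics, CME needs, and exam results can be set side by side even though the underlying raw scales have different numbers of items.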

Here are some suggestions for utilizing the information on this page: 

i) Identify your high need areas. Do these needs relate 
to your practice characteristics? 

ii) Compare your high need areas to your exam scores.
Are there some mismatches? In which direction:
are there some high needs where exam scores are
low, or perhaps low needs scores where exam scores
are high? Perhaps your needs are high even when
your exam scores are high. Is this indicative of
your high goals and ambition in this area?

iii) Are your need scores consistently high? Try to 
understand why. Motivation? Pressure? 
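
The comparisons suggested above can also be sketched in Python for readers who wish to tabulate their own page 4 values. The scale names, the scores, and the cutoffs for "high" and "low" (here one standard deviation either side of the mean of 5.0) are hypothetical and are not prescribed by the Report.

```python
def find_mismatches(needs, exams, high=5.5, low=4.5):
    """Flag scales where CME need (N) and exam score (E) stand out together.

    `needs` and `exams` map scale names to standard scores. The `high`
    and `low` cutoffs are illustrative assumptions.
    """
    findings = {}
    for scale in needs.keys() & exams.keys():
        n, e = needs[scale], exams[scale]
        if n >= high and e <= low:
            findings[scale] = "high need, low exam score"
        elif n <= low and e >= high:
            findings[scale] = "low need, high exam score"
        elif n >= high and e >= high:
            findings[scale] = "high need, high exam score"
    return findings


# Hypothetical standard scores read from page 4
needs = {"Respiratory": 6.0, "Dermatology": 4.0, "Neurology": 5.0}
exams = {"Respiratory": 4.0, "Dermatology": 6.0, "Neurology": 5.0}
for scale, note in sorted(find_mismatches(needs, exams).items()):
    print(f"{scale}: {note}")
```

Scales that fall near the mean on both measures are left unflagged, which keeps attention on the areas most likely to matter in planning CME activities.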



Conclusion 

This interpretive leaflet has merely indicated some approaches to
using the AAPA Self-Assessment Report. You will undoubtedly wrestle with
additional questions and come up with other and more imaginative ways for
using the information in this Report. Once again, we urge caution because
this program is very much in an experimental stage.

Finally, the AAPA congratulates you for taking the self-assessment
exam and hopes that you will find this Report useful. 






